FIELD OF THE INVENTION

This invention relates to the field of cancers and in particular to nucleotide sequences of
the fragile site FRA16D, of the FOR16D gene and amino acid sequences of its encoded
proteins, as well as derivatives and analogs thereof and agents capable of binding
thereto, and uses of these, such as in diagnosis and therapy.

BACKGROUND OF THE INVENTION 

Cancers are a significant factor in mortality and morbidity, with onset rates of forms of
cancer being quite high in all places of the world. Early detection greatly improves the
chances of remission and considerably reduces the chance of the cancer metastasizing.
The treatment of early stage cancers is also much more benign so that there are less
severe residual effects resulting from the treatment. Accordingly, early detection of
cancers is a high priority in management of the diseases. Similarly, treatments of various
cancers have mixed outcomes and it is desirable to provide for alternative treatments at
least for certain forms of cancers.

Cancers are of many different types and severity; however, the uncontrolled
proliferation of cancer cells is invariably associated with damaged DNA of one form or
another. Some types of cancer are familial in the sense that there is an increased risk of
contracting cancer, but the hereditary characteristics in most cancers are not simple and
there is usually only a few-fold increased risk among family members as compared to
the general population. The DNA damage in most cancers is associated with somatic
mutations, the acquisition of which is thought to be associated with exposure to certain
environmental factors.

A very large number of genes have been identified as being associated with the onset of
cancer and this reflects the complexity of the regulation of normal cellular proliferation.
These genes can be categorised into three groups, a first of which includes the so-called
oncogenes or proto-oncogenes, which are often associated with positive control
elements, enhancing cellular proliferation in the normal cellular cycle. Certain
mutations in these positive control elements trigger uncontrolled proliferation. A
second group are the so-called tumour suppressor genes, which are genes that normally
suppress proliferation, and inactivation or reduction in activity of these leads to
abnormal proliferation. These tend to act in a recessive fashion. A third group are the
so-called mutator genes, which are normally responsible for maintaining genome
integrity during the proliferative cycle, and if these are defective then the general
mutation rate increases and the consequent chance of providing for a transforming
mutation increases.

One mapping technique to locate the site of chromosomal lesion in a cancer cell is
known as the loss of heterozygosity (LOH) technique. Eukaryotes have two copies of
each chromosome, apart from the sex chromosomes, and as a result cancers that result
from mutations in a tumour suppressor generally require two mutations. Sometimes one
mutation will be inherited, and a second mutation is required to trigger the cancer,
leading to loss of function of both copies of the gene in the individual. Quite often
these secondary mutations will be deletions and their location can be detected by
checking the presence of highly polymorphic genetic markers from the tumour tissue
and from another site such as blood. The markers that are heterozygous in normal
tissue and have become homozygous in the cancer tissue can give an indication of the
lesion concerned.
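
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming each marker has already been genotyped as a pair of allele labels in both normal and tumour tissue; the function name, the allele labels and the example genotypes are hypothetical and are not data from this specification.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]  # the two alleles observed at one marker


def loh_markers(normal: Dict[str, Genotype],
                tumour: Dict[str, Genotype]) -> List[str]:
    """Return markers that are heterozygous in normal tissue but
    homozygous in tumour tissue (candidate loss of heterozygosity)."""
    hits = []
    for marker, (a, b) in normal.items():
        if a == b:
            continue  # uninformative: already homozygous in normal tissue
        if marker not in tumour:
            continue  # marker was not typed in the tumour sample
        t1, t2 = tumour[marker]
        if t1 == t2 and t1 in (a, b):
            hits.append(marker)
    return hits


# Hypothetical genotypes, for illustration only.
normal = {"D16S518": ("A", "B"), "D16S3029": ("A", "B"), "MARKER_X": ("A", "A")}
tumour = {"D16S518": ("A", "A"), "D16S3029": ("A", "B"), "MARKER_X": ("A", "A")}
print(loh_markers(normal, tumour))
```

Markers that are homozygous in normal tissue are skipped because they cannot show a change, which is why a dense set of highly polymorphic markers is needed.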

The LOH technique is, however, quite difficult to perform routinely and to interpret
reliably. This is particularly so because any tumour sample is usually contaminated
by non-tumour tissue, so that loss of an allele appears only as a decreased
relative intensity that is at times difficult to distinguish, and quantitative
amplification techniques will often need to be employed. Another limitation relates to
the availability of a suitably dense array of markers, which generally leads to the
detection only of larger deletions. A single tumour may have LOH in many distinct
regions, but LOH will only be detected in those regions that have been tested.

These LOH studies have identified a number of sites, some of which
correspond to regions of the chromosome termed fragile sites.

Fragile sites have been proposed to have a determining role in cancer-associated
chromosomal instability. There are in excess of 100 fragile sites in the human genome,
of which the fragile site FRA11B is located within the CBL2 proto-oncogene (Jones et
al., 1994, 1995) and the FRA3B, FRA7G and FRA16D sites have been located within
or adjacent to regions of instability in cancer cells (Ohta et al., 1996; Sozzi et al., 1996;
Engelman et al., 1998; Huang et al., 1998a,b).

There are two distinct forms of chromosomal anomaly referred to as fragile sites
(Sutherland et al., 1998). The 'rare' form is polymorphic in the population and is
accounted for by the expansion of repeat DNA sequences beyond a copy number limit.
The 'common' form is present at many loci in all individuals. Despite the
complete sequence analysis of the common fragile site FRA3B (Boldog et al.,
1996; Inoue et al., 1997; Mimori et al., 1999) and the partial sequence analysis of the
common fragile sites FRA7G and FRA7H (Huang et al., 1998a,b; Mishmar et al.,
1998), the molecular basis for common fragile sites is not yet understood.

Fragile sites are also distinguished by the culture conditions required for their
induction. Common fragile sites are (mainly) induced by aphidicolin, whereas the rare
fragile sites are induced by either high or low concentrations of folate or the AT-rich
binding chemicals such as distamycin A or by bromodeoxyuridine. The role of
chromosomal fragile sites in human genetic disease was thought to be restricted to
fragile X syndrome caused by the FRAXA fragile site; however, a mild form of mental
retardation has been associated with FRAXE, and the FRA11B fragile site appears to
predispose to 11q breakage leading to some cases of Jacobsen syndrome.



Recent detailed molecular analysis of fragile site loci has demonstrated that the common
fragile site FRA3B is located within a region subject to localised deletion and that this
deletion is frequently observed in certain forms of cancer (Ohta et al., 1996; Sozzi et
al., 1996). FRA3B lies proximal to the major region of LOH on chromosome 3p
previously shown to be responsible for deletion of the VHL tumour suppressor (Gnarra
et al., 1994). The cancer-associated FRA3B deletions can result in inactivation of a
gene (FHIT, Fragile Histidine Triad) which spans the fragile site (Croce et al., US
patent 5928884). The FHIT gene product has been shown to have a role in tumour
growth (Siprashvili et al., 1997) but the significance and nature of that role are
the subject of active research at present.

Another common fragile site, FRA7G, has also been shown to be located within a
region of about 1 Mb that is frequently deleted in breast and prostate cancer (18, 19) as well as
squamous cell carcinomas of the head and neck, renal cell carcinomas, ovarian
adenocarcinomas and colon carcinomas (20). The human caveolin-1 and -2 genes are
located within the same commonly deleted region as FRA7G. Caveolin-1 has been
shown to have a role in the anchorage dependent inhibition of growth in NIH 3T3 cells
(21). The caveolins are therefore candidates for the tumour suppressor gene presumed
to be located in the FRA7G region (20).



Another common fragile site which is aphidicolin inducible is the FRA16D site.
FRA16D has been localised at 16q23.2, within a large overlapping region of
chromosomal instability in breast and prostate cancer as defined by
loss-of-heterozygosity (24, 25). One study has found that a significant proportion
(77%) of breast cancers carry a deletion at 16q23.2, including the marker D16S518 in
the immediate vicinity of FRA16D (24).

There has been no characterisation of a nucleic acid or protein associated with the
FRA16D site and the physical location of FRA16D has not yet been determined. Such
a characterisation is desirable to enable potentially early diagnosis and assessment of
risk as well as potentially providing for a therapeutic treatment.

SUMMARY OF THE INVENTION 

The inventors have produced a detailed physical map of the FRA16D region which
provides markers to identify a relationship between this fragile site and DNA instability
in neoplasia and which, further, may allow better diagnosis of cancers associated with
the region. This analysis reveals the existence of an intimate relationship between the
location of FRA16D and homozygous deletions in various tumours, culminating in the
coincidence of two tumour cell DNA breakpoints with the most likely position of the
fragile site.

The inventors have also characterised the nucleic acid associated with FRA16D,
especially by nucleic acid sequencing. Analysis of the DNA sequence has identified a
number of introns and exons which are found to exist in four different splice variants of
what will be termed protein FOR16D. RNA analysis has also been conducted and thus
far two species of mRNA associated with the region have been detected.

In a first aspect the invention could be said to reside in a method of detecting genetic
variations of a 16q23.2 target in the 16q23.2 region of the chromosome, said method
comprising the steps of contacting target nucleic acid with one or more oligonucleotides
suitable for use as hybridisation probe or PCR priming specific for binding the 16q23.2
specific target, and ascertaining the binding of said oligonucleotide.

It will be understood from the specification that the 16q23.2 specific target might be
selected to be within the group comprising the FOR16D gene, the FRA16D site, or
mRNA encoding FOR16D protein or all of these collectively. The target may include
chromosomal rearrangements and mutations thereof and the rearrangements or
mutations may, in one form, be cancer associated. The variations may include markers
in the region such as set forth in this specification including in figures 1, 2, 7 and 8.

The 16q23.2 target within the FOR16D gene might be selected from one or more of the
group comprising exons a, 1, z, w, 2, 3, 4, 5, 6 or x or introns located therebetween or
control elements in other adjacent regions that effect an altered expression of the
FOR16D gene. Such adjacent regions may have a promoter, enhancer elements or other
regulatory elements. The target may be any one of the splice variants currently
identified as FOR16DI, FOR16DII, FOR16DIII or FOR16DIV or it might include
other combinations of two or more of the exons.

It is noted in particular that breakpoints of three out of five 16q23.2 translocations
associated with multiple myeloma map within the alternatively spliced intron of
FOR16D, that is, between exons 4 and x, and in one form a preferred target is the intron
between exons 4 and x or a portion thereof.

In some circumstances the method might be used to detect any rearrangements in a
larger target area. Thus it might be desired to use a plurality of oligonucleotides which
might be selected to bind to a range of target binding sites within the 16q23.2 specific
target to detect a range of changes. This might be used for example to detect
chromosomal rearrangements such as deletions within the FRA16D site or beyond that
in the broader 16q23.2 region. The plurality of oligonucleotides or a plurality of
specific binding sites of the 16q23.2 target are preferably spatially separated so that
binding of each of the plurality of oligonucleotides or binding to the plurality of specific
binding sites can be separately ascertained. The spatial separation might, for example,
be conveniently provided as an array on a solid support, for example in a form that is
commonly referred to as a gene chip (see for example patent specifications US 5288514
and US 5593839). Instead of a plurality of oligonucleotides it may be desired that the
target be probed by a single oligonucleotide.

Alternatively the target area might be small, thus for example the method might be used
to ascertain the presence or absence of a particular mutation or allelic variation in the
16q23.2 target. Thus for example a target of the z, w, 5 or 6 or x exon will
distinguish between FOR16DI, FOR16DIV, FOR16DII and FOR16DIII transcription
variants. A small target area might also be adequate for use with gross chromosomal
rearrangements in so far as this might be used to determine the presence or absence of
junctions of known chromosomal rearrangements, or alternatively the binding or
non-binding of one or more of a plurality of oligonucleotides. The target area might also be
selected to allow for assessment of the presence or absence of cancer associated point
mutations or small DNA rearrangements, using suitably selected oligonucleotides.

The base sequence of the oligonucleotide chosen will depend upon several factors
known in the art. Primarily the sequence of the oligonucleotide will be determined by
its capacity to bind to the target nucleic acid sequence. The nature of the sequence will
depend to some extent on the stringency of the hybridisation required, and whether or
not it is desired for one oligonucleotide to detect variation in sequence or not. If
variation in one nucleotide is required the stringency of the hybridisation will be high.
The length of the oligonucleotide will also be determined by the stringency of the
reaction required.
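
As a rough illustration of how oligonucleotide length and base composition relate to hybridisation stringency, the sketch below estimates a melting temperature with the widely used Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a first approximation for short oligonucleotides. The specification does not prescribe this rule; it and the example sequence are assumptions introduced here for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); a rough guide for short oligonucleotides.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 15-mer, not a sequence from this specification.
probe = "ATGCGTACCTGAAGC"
print(reverse_complement(probe), wallace_tm(probe))
```

A longer or more GC-rich oligonucleotide gives a higher estimate, so it tolerates more stringent washing before the duplex with its target dissociates.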

The binding might be by in situ hybridisation of a chromosomal spread, or other
suitable spatial arrangement of the target region such as for example on a so-called gene
chip. Such hybridisation methods will generally provide for an oligonucleotide
capable of binding the target over a span of at least 15 nucleotides. In the case of
hybridisation techniques the oligonucleotides will generally carry a label which can be
detected by known measuring methods, especially when bound to the 16q23.2 target.
Such labels might include radiolabels such as 32P or a fluorescent marker.

The method might require a preamplification step whereby the target nucleic acid is
amplified, to make it easier to ascertain the binding or non-binding of the nucleic acid to
the target site.

On the other hand the oligonucleotide might be suitable for amplification of a segment
of the target nucleic acid such as by PCR, in which case the size of the target may be
somewhat different. With this variation two oligonucleotides might be selected to
provide for amplification of at least part of the target nucleic acid; at least one of the
oligonucleotides is required to bind in the target.

The target nucleic acid might be presented in any one of a number of physical forms.
Nucleic acid from an individual might be isolated and perhaps digested by a restriction
enzyme and spread out such as by electrophoresis on an agarose or polyacrylamide gel,
so that binding of the oligonucleotide can be effected whilst the target nucleic acid is
supported by the gel or this might be supported on other solid medium such as a gene
chip or a metaphase chromosomal spread. Alternatively the oligonucleotide or
oligonucleotides might be fixed, and the target nucleic acid might either be diminished
in size, or not, and then binding of fragmented targets to the fixed oligonucleotide
determined.

The target nucleic acid might be in the form of chromosomal DNA, or might be cDNA
or mRNA.

This method might also be used to detect other variants, homologs or analogs of the
FRA16D site, FOR16D gene, or other nucleic acid sequences disclosed in this
specification. Thus it might be, for example, desirable to determine an analogous gene in
livestock, domestic, laboratory or sporting animals. Alternatively one might wish to
determine another analogous protein that plays a similar role in humans.

In a second aspect the invention relates to a method of detecting the number of alleles
for one or more markers in the 16q23.2 target, and this may be a means of perhaps
providing a measure of the loss of heterozygosity in an individual. This aspect of the
invention therefore relates to locating a deletion that overlaps with the FRA16D region.
The method might be achieved by providing a first set of one or more oligonucleotides
and a second set of one or more oligonucleotides, the first set of oligonucleotides being
specific for a first variant of the target nucleic acid, the second set of oligonucleotides
being specific for a second variant of the target nucleic acid, the first and second set of
oligonucleotides being labelled so as to be capable of being distinguished, and the
method comprising the steps of comparing the proportion of binding of the first and
second set of oligonucleotides. A method of this sort is set forth in US patent
specification 5928870 to Lapidus et al., which for purposes of practicing the invention
is incorporated herein by reference.
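
One way to compare the proportion of binding of the two labelled oligonucleotide sets is to normalise the tumour allele ratio against the ratio seen in normal tissue from the same individual. The sketch below is an illustrative assumption, not the method of the cited Lapidus et al. specification; the signal values and the cut-off of 0.5 are hypothetical.

```python
def allelic_imbalance(normal_a: float, normal_b: float,
                      tumour_a: float, tumour_b: float) -> float:
    """Return the tumour allele ratio normalised to the normal allele ratio.

    The result is folded into the range (0, 1], so that 1.0 means the two
    alleles are in the same proportion in tumour and normal tissue and
    smaller values indicate increasing imbalance.
    """
    if min(normal_a, normal_b, tumour_a, tumour_b) <= 0:
        raise ValueError("all signals must be positive")
    ratio = (tumour_a / tumour_b) / (normal_a / normal_b)
    return ratio if ratio <= 1.0 else 1.0 / ratio


def suggests_loh(score: float, cutoff: float = 0.5) -> bool:
    """Flag a marker when the normalised ratio falls below an assumed cut-off."""
    return score < cutoff


# Hypothetical signal intensities for the first (A) and second (B) probe sets.
score = allelic_imbalance(normal_a=100.0, normal_b=100.0,
                          tumour_a=100.0, tumour_b=40.0)
print(round(score, 3), suggests_loh(score))
```

Because tumour samples are usually contaminated with normal tissue, the lost allele rarely disappears completely, so any cut-off would need to be calibrated for the assay actually used.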

It will be understood that the above method is useful in categorising the risk of 
contracting certain types of cancer associated with the FRA16D fragile site or other 
portion of the 16q23.2 region. 

In a third aspect the invention could be said to reside in a method of determining the
level of expression of the FOR16D gene or any one or more exons thereof, by
determining the level of mRNA expression using a probe specific for the FOR16D gene
or exon thereof. This might be used to determine the dysregulation of FOR16D
expression. It will be understood that it may be desired to also determine the level of
expression of variants of the gene or exons including rearrangements and mutants
including those associated with cancers. This is likely to give a prognosis in relation to
at least certain cancers that are currently contracted or perhaps an indication of the risk
of contracting one or more types of cancer.

In a fourth aspect the invention could be said to reside in an isolated nucleic acid
molecule selected from the group comprising

a) nucleic acid sequences disclosed in the figures hereto or parts thereof

b) FRA16D site

c) FOR16D gene, or exons thereof

d) mRNA of the FOR16D gene

e) cDNA of the FOR16D gene

f) variants of the above including chromosomal rearrangements and
mutations of sequences set out in a) to e) including those variants
associated with cancers

g) nucleic acid sequence capable of hybridising specifically to any
sequence of a) to e) above or its complement, and especially those capable
of doing so under stringent conditions.

The nucleic acid molecule might include a mosaic from within the above molecules such
as a combination of two or more of the group comprising the following: exons a, 1, z,
w, 2, 3, 4, 5, 6 or x or introns located therebetween or control elements in other
adjacent regions that effect an altered expression of FOR16D, and it will be understood
that such a mosaic includes a molecule encoding cDNA of variants of the FOR16D
protein, whether a wild type allele, a mutated version, or otherwise rearranged. It will
thus be understood that the invention includes antisense molecules to any regions of
control that might be contemplated above. Such antisense molecules may be used to
vary the expression of such proteins as are produced by the FOR16D gene or perhaps
adjacent genes such as the c-MAF gene.

It will be understood that such nucleic acids include portions of nucleic acids that are 
suitable for use as primers or probes. 

The invention may also be said to include nucleic acids encoding a tumour associated
gene from a human or animal capable of hybridizing with any nucleic acid of the fourth
aspect of the invention.

In a fifth aspect the invention could be said to reside in a recombinant vector including
one or more nucleic acid sequences as set out above, and preferably operably linked to
a control element such as might include a functional promoter. The recombinant vector
might be used as an expression vector to produce or overproduce FOR16D protein or
variants thereof, or perhaps overproduce nucleic acids associated with the FOR16D
gene such as an antisense molecule. Suitable vectors are generally available
commercially or may be constructed as described elsewhere or as is known in the art.

In a sixth aspect the invention could be said to reside in an isolated protein molecule,
the protein molecule being selected from the group comprising the following:

a) a FOR16D protein, or

b) a mutant or variant FOR16D protein which might optionally be
associated with a cancer.

In a seventh aspect the invention could be said to reside in a polypeptide produced by
any two or more exons selected from the group comprising a, 1, z, w, 2, 3, 4, 5, 6 and x
joined together, said exons being either complete or partial, and optionally being variants.

The invention might also encompass a purified cancer associated protein including a
string of amino acids unique to a FOR16D protein and more particularly as set out in
any one of figures 13 A to D, preferably said amino acid string being at least 10 amino
acids long and exhibiting at least 70% amino acid homology, more preferably at least
90% homology.

The protein may have an oxidoreductase domain or may have a role in DNA replication
or chromosomal division.

In one form the purified cancer associated protein includes an amino acid string with an 
amino acid sequence homology of greater than 70% but more preferably greater than 
90% with the amino acid string LPPGWEERT, and is associated with DNA replication 
or chromosomal division. Such a purified protein may be used for treatment of certain 
cancers. 

In another form the purified cancer associated protein includes an amino acid string 
with an amino acid sequence homology of greater than 70% but more preferably greater 
than 90% with an amino acid string selected from the group comprising: 

VVVVTGANSGIG, MTLDLALLRSVQ, PLDVLVCNAA and VNHLGHFYL.
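
The 70% and 90% thresholds above can be illustrated with a simple ungapped comparison. The sketch below assumes that "homology" is measured as percent identity over a window the length of the reference string, which is a simplification adopted here and not a definition given in this specification; the candidate sequences are hypothetical.

```python
def best_identity(protein: str, motif: str) -> float:
    """Return the highest percent identity between the motif and any
    ungapped window of the same length within the protein sequence."""
    n = len(motif)
    best = 0.0
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        matches = sum(1 for x, y in zip(window, motif) if x == y)
        best = max(best, 100.0 * matches / n)
    return best


MOTIF = "LPPGWEERT"  # amino acid string recited in the text above

# Hypothetical candidate sequences, for illustration only.
exact = "AAALPPGWEERTAAA"
one_mismatch = "AAALPPGWEEKTAAA"

for candidate in (exact, one_mismatch):
    score = best_identity(candidate, MOTIF)
    print(candidate, round(score, 1), score > 70.0, score > 90.0)
```

A production comparison would normally use a gapped alignment with a substitution matrix; the sliding window is used here only to make the threshold arithmetic explicit.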

In an eighth aspect the invention includes an agent capable of selectively binding a
FOR16D protein or fragment or variant thereof. Such agents may be particularly useful
in diagnostic methods. Such an agent may also be used to bind a protein containing a
string of amino acids unique to FOR16D or variant thereof and in particular such
variants that are currently known to be associated with one or more forms of cancer.
The agent may selectively bind to the variant FOR16D as compared to a FOR16D
protein not associated with cancer. Such an agent might be an agonist or an antagonist
of FOR16D function. It might therefore be desired to provide for a number of agents
each capable of selectively binding to a separate one of a number of variants of
FOR16D so that it is possible to distinguish between variants. Thus for example it
might be desired to target the C terminus of respectively FOR16DI, FOR16DII,
FOR16DIII and FOR16DIV to distinguish between these four proposed forms. The
invention therefore also encompasses a method of detecting variants of the FOR16D
protein. Measuring the relative levels of these four and other forms of FOR16D protein
is likely to give an indication of regulatory perturbations which may be associated with
certain cancers.

The nature of the agents can vary depending on their intended use. Thus for a
diagnostic method an antibody or fragment thereof, such as an Fab fragment, or a
recombinant molecule carrying the variable region of an antibody recognising the
desired portion of the FOR16D protein may be adequate. The antibody might be polyclonal;
however, preferably the antibody is a monoclonal antibody prepared by known
techniques.

Alternatively small molecules capable of binding the desired portion of the FOR16D
protein may be used. Such small molecules might include peptides, proteins, nucleic
acids or sugars or other organic molecules. These can be isolated by screening using
known techniques from libraries of suitable compounds. Such small molecules can
then be tested for antagonist or agonist properties to potentially provide therapeutic
agents which have the potential to be used in the treatment of cancers. These agents
would be administered by clinicians in an appropriate manner.

Also useful therapeutically might be the provision of an isolated protein of the seventh 
aspect of the invention, particularly those forms that mimic the action of a wild type 
FOR16D, and perhaps simply the purified FOR16D. It is anticipated that the FOR16D 
protein in at least one of its forms is a tumour suppressor, that is, its absence increases 
the risk of aberrant cell division leading to a cancer. Accordingly one form of therapy 
may include the administration of such a protein to an individual who is considered at 
risk, particularly if they are found to have a faulty FOR16D protein. Such 
administration would be in conformity with normal practices in a suitable excipient. It 
may also be the case that the aberrant FOR16D protein actively enhances 
tumourigenesis and accordingly it might be appropriate to administer an antagonist of 
the aberrant variant at the same time. Alternatively the administration of the antagonist 
on its own may be of therapeutic benefit. 

Another form of treatment which is becoming increasingly contemplated is to provide
for a method of gene therapy and one method of undertaking cell therapy is to provide
for certain progenitor cells which have incorporated therein a vector capable of
producing an appropriate form of FOR16D protein. Accordingly in a ninth aspect the
invention could be said to reside in a recombinant host cell having stably inserted
therein DNA of any one of the forms of DNA contemplated in the third aspect of the
invention. In preference the DNA is capable of producing a tumour suppressing form
of FOR16D, and most conveniently this will be a wild-type form of FOR16D, which
may simply be a cDNA molecule or the FOR16D gene. Alternatively however it may
also be desired to have a host cell which has a DNA sequence capable of producing an
antisense molecule in the case where an aberrant tumour promoting form of the
FOR16D molecule is produced by the individual to be treated, the antisense capable of
reducing the level of expression of the FOR16D molecule.

Methods of gene therapy are not limited to cases where the appropriate nucleic acid is
delivered in a host cell, but also include the administration of the nucleic acid
specifically to the site of interest.

The recombinant host cell may not necessarily be used for therapeutic purposes; it may
also be used for over-expression of the protein, or a nucleic acid associated with
FOR16D, or the 16q23.2 region, and may therefore be bacterial, yeast, plant, animal,
preferably mammalian or human.

Additionally the invention contemplates the provision of a transgenic non-human animal
carrying recombinantly altered or overexpressing 16q23.2 DNA, preferably FRA16D
or FOR16D gene, or other DNA of the fourth form of this invention. The recombinant
DNA might be incorporated into the chromosome of the host, alternatively the host cell
may carry said recombinant DNA in a self replicating element such as a plasmid.

The agents of the eighth embodiment may be used to measure the level of expression of
FOR16D, variants or exons thereof, to determine whether there is an altered level of expression.
Thus a western blot using a labelled agent may be used for the purpose using known
techniques. This is another means of measuring dysregulation of expression.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1: Positional cloning of FRA16D and location of loss of heterozygosity
and translocation in cancer.

A. The locations of loss-of-heterozygosity regions in breast and prostate
cancer and the approximate location of the FRA16D fragile site are
indicated with respect to genetic markers (downward arrows) in the
16q23.2 region. Markers in the vicinity of FRA16D are shaded. The
approximate locations, as determined by Chesi et al. (1), of multiple
myeloma breakpoints and the c-MAF gene (bar) are also shown by
upward black arrows. Not to scale.

B. Map of the contig of YAC subclones across the FRA16D region with
respect to genetic markers and FRA16D. Open boxes indicate those
YACs which map by fluorescence in situ hybridisation proximal to
FRA16D, grey boxes are those which span FRA16D and black boxes
indicate those YACs which map distal to FRA16D. Not to scale.

Figure 2: Positional cloning of FRA16D and the extent of heterozygous and
homozygous deletion in the AGS tumour cell line.

A. Pulsed-field gel map of ~1 Mb of the 'Right Hand Side' (RHS) of
YAC My801B6 and the location of BACs, genetic and STS markers
(key markers are boxed). Restriction sites between Afma336yg9 and
WI2755 are shown in B. The AGS stomach cancer cell line
homozygous deletion is indicated: shaded circles denote the presence
and open circles the absence of PCR products for the STS markers.
Maximal region of heterozygous deletion in AGS cell line is indicated by
polymorphic D16S518 and D16S3029 PCR products, indicated as A
and B alleles. The two AGS cell line chromosome 16s are indicated by
shaded bars.

B. Restriction map of the critical FRA16D region (Afma336yg9 to
D16S3029) showing the location of key members of the lambda
subclone tile path used for FISH in figure 3. Clones designated λ-n are
from 325M3; others are from 801B6. Open boxes represent those
subclones found to map proximal (on the basis that >85% of their FISH
signals were proximal to FRA16D), grey boxes those which appear to
span the fragile site (less than 85% on one side or other of FRA16D)
and black boxes those which are distal to the fragile site (on the basis
that >85% of their FISH signals were distal to FRA16D). λ clones
which gave high background on FISH were not scored. These and
other λ clones for which FISH data were not obtained are included as
thin boxes. STS localisation of the AGS homozygous breakpoints is
indicated by the presence (shaded circles) and absence (open circles) of
PCR products.
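
The >85% scoring rule used in this legend to place each lambda subclone relative to the fragile site can be written out as a short sketch. The function name and the example counts are hypothetical; only the 85% threshold and the three categories are taken from the legend.

```python
def classify_clone(proximal: int, distal: int, threshold: float = 0.85) -> str:
    """Classify a clone from counts of FISH signals seen proximal and
    distal to the fragile site gap or break.

    More than `threshold` of signals on one side places the clone on that
    side; otherwise the clone is treated as spanning the fragile site.
    """
    total = proximal + distal
    if total == 0:
        return "not scored"  # no usable FISH data for this clone
    if proximal / total > threshold:
        return "proximal"
    if distal / total > threshold:
        return "distal"
    return "spanning"


# Hypothetical signal counts, for illustration only.
for name, counts in {"clone_1": (18, 2), "clone_2": (10, 10),
                     "clone_3": (1, 19), "clone_4": (0, 0)}.items():
    print(name, classify_clone(*counts))
```

Clones classified as spanning mark the interval in which the fragile site most likely lies, which is how the tile path narrows the position of FRA16D.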

Figure 3: Fluorescence in situ hybridisation (FISH) of lambda subclones against
FRA16D expressing chromosomes.

Each panel contains two FRA16D expressing partial metaphases, with
and without FISH signal merged. In each case the width of the gap or
break at the fragile site is greater than the width of the chromatid. (a)
^04 showing signal proximal to FRA16D; (b) λ181 showing signal
proximal and distal to FRA16D; (c) λ191 (upper) and λ8 (lower)
showing signal distal to FRA16D. Images of metaphase preparations
were captured by a cooled CCD camera using the ChromoScan image
collection and enhancement system (Applied Imaging Int. Ltd.). FISH
signals and the DAPI banding pattern were merged for figure
preparation.

Figure 4: Fluorescence in situ hybridisation mapping of the lambda subclone tile
path across FRA16D.

The individual lambda clones were scored against chromosomes where
the FRA16D gap or break was greater than the chromatid width. Each
increment represents a single FISH signal, n = number of chromosomes
scored. Scores were plotted as proximal (p) and distal (d) with respect
to FRA16D. Maximum location for FRA16D is indicated by arrows.
Location of BAC clones 325M3 and 353B15 is also shown. The boxed
lambda contig subclones indicate those for which FISH signal results
with respect to the FRA16D fragile site were obtained: open boxes had
>85% signal proximal to FRA16D; grey boxes, spanning (<85% signal
on one side or other of FRA16D); and black boxes had >85% signal
distal to FRA16D. While this figure is not to scale the location of the
lambda clones can be determined from their position in figure 2. Thin
boxed lambda clones are those for which FISH data were not obtained.

Figure 5: Duplex PCR deletion detection at the FRA16D locus in tumour cell
lines.

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PCR products from the duplex of STSG-10102 and dystrophin DMD 
Pm were subjected to agarose gel electrophoresis and ethidium bromide 
30 staining. Template DNAs were seven tumour cell lines and blood bank 

and no DNA controls. Markers are Hpall digested pUC19. The position 
of the STSG-10102 and DMD Pm PCR products are indicated by large 
grey-filled arrows while the primer dimer PCR artefact is indicated by a 
small white arrow. 

35 

Figure 6 is a diagrammatic representation of FOR16D transcripts with respect to
FRA16D and common homozygous deletions. A. summarises the data
in figure 2. B. shows the position of two BACs and below that shows
the DNA sequences that have been obtained. C. shows three of the four
predicted variants of FOR16D, and indicates the ESTs that have been
utilised to determine the open reading frames of the introns that
collectively provide for the alternate splice variants of FOR16D
transcripts. Also shown are the sequence positions of the exons shown
from the position of the respective EST from which sequence was
obtained.

Figure 7 is a second diagrammatic representation of FOR16D alternate transcripts
with respect to FRA16D and common homozygous deletions. A. and
B. are duplications of Figure 1 (above). C., moving from top to bottom,
shows the relative position of certain mutations relative to a restriction
enzyme map of the YAC My801B6, as well as the relative location of
two further YACs My891F3 and My972D3. Below that are shown the
positions of four BACs; below that are shown the positions of deletions in
the cell lines AGS and HCT116; also shown is the position of the c-MAF
oncogene. Below that are shown the regions that are sequenced,
and the location of three multiple myeloma translocation break points.
Following that are shown the four known alternate spliced transcripts
and a listing of the ESTs that confirm the position of the four transcripts.

Figure 8 is a diagrammatic representation of four of the predicted splice variant
transcripts as well as a representation of a Northern blot analysis of RNA
from various physical locations indicated using a portion of exon 3 as a
probe. A similar result is found when a probe from exon X is used.

Figure 9 is a composite DNA sequence of the predicted FOR16DI transcript. The
composite has been constructed by conjoining ESTs as indicated.

Figure 10 is a composite DNA sequence of the predicted FOR16DII transcript.
The composite has been constructed by conjoining ESTs as indicated.

Figure 11 is a composite DNA sequence of the predicted FOR16DIII transcript.
The composite has been constructed by conjoining ESTs as indicated.

Figure 12 is a composite DNA sequence of the predicted FOR16DIV transcript.
The composite has been constructed by conjoining ESTs as indicated.

Figure 13 sets out the composite amino acid sequences predicted for the sequences for
FOR16DI, FOR16DII, FOR16DIII and FOR16DIV as shown in figures
9 to 12.



Figure 14 sets out certain amino acid homologies of the predicted amino acid
sequences for FOR16DIV and FOR16DI, using the BLAST program
(Altschul et al (1997) Nucleic Acids Res. 25:3389-3402) and the
Swissprot database. Each comparison sets out the Swissprot number
assigned to the sequence compared with; the FOR16D amino acid
sequence is on top, (:) indicates sequence identity and (+) indicates
conserved substitution, and the bottom sequence of each comparison is the
sequence accessed from the Swissprot database.

Figure 15 sets out DNA sequences for each of the exons identified for the
FOR16D protein.

Figure 16 is about 270kb of DNA sequence that overlaps and defines within it the
FRA16D fragile site, which is shown to reside between exons 4 and 5.

Figure 17 is DNA sequence for contig #208 as indicated in figure 6, and which
encompasses exon 3.

Figure 18 is DNA sequence for contig #779 as indicated in figure 6, and which
encompasses exon 2.

DETAILED DESCRIPTION OF THE INVENTION

EXAMPLE 1 - MAPPING OF THE FRA16D FRAGILE SITE

Materials and methods

Isolation of DNA probes and YACs in the FRA16D region
Nine DNA probes, ACH202 (D16S14), c311F2, c302A6 (D16S1075), c301F10
(D16S373), 16-87 (D16S181), c306D2, 16-08 (D16S162), c307A12 and CRI-0119
(D16S50), which had been physically mapped into the 16q23 region (30), were chosen
for fluorescence in situ hybridisation (FISH) against FRA16D expressing
chromosomes. Four of these markers mapped within the same somatic cell hybrid
breakpoint interval defined by the cell lines CY113(P) and CY121 (30). One of these,
c306D2, mapped proximal to FRA16D by FISH while the others, c307A12, CRI-0119
and 16-08, mapped distal to FRA16D. These probes were therefore used as starting
points to isolate a contig of cloned DNA spanning FRA16D. In the Los Alamos
National Laboratory database (www-ls.lanl.gov) an STS sequence from c306D2 was
found within the CEPH YACs My903D9, My912D2 and My933H2 while an STS in
c307A12 was found in My891F3 and My972D3. These YACs were obtained from
CEPH and the prepared DNA was subjected to Pst I digestion, Southern blotted and probed
with 16-08, 16-87, CRI-0119, c306D2 and c307A12 in succession in order to confirm
their content. In addition a search of the Whitehead Institute database
(www-genome.wi.mit.edu) revealed that the two sets of YACs were joined into a
contig by the YACs My801B6, My845D9 and My944D8. Each of these YACs was
used as template DNA to assess STS content (D16S518, Afma336yg9, WI2755,
STSG-10102 and D16S3029) and subjected to FISH to assess position with respect to
FRA16D (Figure 1B).

Additional probes, STSs and BACs from the FRA16D region
Additional probes were generated from the YAC 801B6 by subcloning Pst I digests of
YAC DNA and screening with total human DNA as probe. These subclones were
digested with Hinc II to identify and isolate non-repetitive DNA fragments as probes.
This generated markers H13m, H22s, H23m, H29m and H40m. Genome System Inc.
BAC library filters were screened with the probes D16S518, Afma336yg9, WI-2755,
STSG-10102, H22s, H29m and D16S3029 and nine BAC clones including 379C2,
325M3 and 353B15 were identified. An additional STS, named 2AS, was established
by 'bubble' PCR from the end-fragment of BAC 353B15 and was isolated as described
by Gecz et al (31). Briefly, the BAC DNA was digested with Alu I and ligated to the
annealed bubble linkers. The final PCR was carried out with a combination of Not I-A
bubble primer and Sp6-promoter primer as described except an annealing temperature
of 55°C was used. These STSs and hybridisation probes were used to establish
restriction maps of the YAC My801B6 and the BACs (Figure 2A).

Subcloning and contig assembly
The YAC My801B6 and the BAC 325M3 were used as DNA templates for establishing
lambda subclone libraries in λGEM11 or λGEM12 vectors (Promega) according to
the supplier's protocol. My801B6 and 325M3 appeared to have intact human DNA
inserts, based on comparative pulsed field gel mapping of the YACs and BACs across
the region (data not shown).

Fluorescence in situ hybridisation

FRA16D-expressing metaphases were obtained from peripheral blood lymphocytes by
standard methods. Briefly, cultures were grown for 72 hours in Eagle's minimal
essential medium, minus folic acid, supplemented with 5% fetal calf serum.
Induction of FRA16D was with 0.5 µM aphidicolin (dissolved in 70% ethanol) added
24 hours before harvest (32). DNA clones were nick-translated with biotin-14-dATP,
pre-associated with 6 µg/µl total human DNA, hybridised at 20 ng/µl to metaphase
preparations, and detected with one or two amplification steps using biotinylated
anti-avidin and avidin-FITC as previously described (33). Hybridisation signal was
visualised using an Olympus AX70 microscope fitted with single pass filters for DAPI
(for chromosome identification), propidium iodide (as counterstain) and FITC.
FRA16D-expressing chromosomes were scored for signal only when the width of the
fragile site gap was greater than the width of one chromatid, so that signal was
unambiguously proximal or distal to the gap (Figure 3). Only fluorescent dots which
touched chromatin were scored as signal - the few fluorescent dots which lay within the
fragile site gap but did not touch proximal or distal segments were therefore not scored
as signal since there was a possibility that they comprised non-specific background.
Lambda clones which gave very poor FISH results (high non-specific hybridisation to
other chromosomes) were not able to be scored with respect to the fragile site. This is
likely to be due to the large amount of repetitive DNA within these particular clones -
see below.

Tumour cell lines
The tumour cell lines LoVo, HT29, Kato III, SW480, AGS, MDA-MB436 and LS180
were purchased from the American Type Culture Collection. LoVo and AGS cells
were grown in Hams F12 medium with 2mM L-glutamine, 10% fetal calf serum in 5%
CO2; Kato III cells were grown in RPMI1640 medium with 2mM L-glutamine, 20%
fetal calf serum in 5% CO2; HT29 cells were grown in McCoy's 5a medium with
1.5mM L-glutamine, 10% fetal calf serum in 5% CO2; LS180 cells were grown in
Eagle's minimal essential medium with 2mM L-glutamine and Earle's salts and
non-essential amino acids, 10% fetal calf serum in 5% CO2; SW480 cells were grown
in Leibovitz's L15 medium with 2mM L-glutamine and 10% fetal calf serum;
MDA-MB436 cells were grown in Leibovitz's L15 with 16 µg/ml glutathione and
0.026units/ml insulin.

PCR detection of homozygous deletion in tumour cell DNAs
PCRs for the detection of individual sequence tagged sites from across the FRA16D
region were duplexed (34) with control PCRs from the dystrophin gene on the X
chromosome (DMD Pm or DMD49, ref 35) or the APRT gene on chromosome 16
(33). This allowed verification that the PCR reaction was working in the absence of a
FRA16D region PCR product (Figure 4). Suitable PCR primers for Alu29, 17Sp6,
Alu20, 178poly, 5.1A6, RD69, IM7 were used or for 504CA, forward 5'-
AACACAGCTCTTATCACATCC-3', reverse 5'-TGGCTGTAmGTCAGAACTG-3';
while others were as given in database accessions, D16S518 (GenBank Z24645),
Afma336yg9 (GDB 1222843), WI2755 (GenBank G03520), STSG-10102 (GenBank
Z23147), D16S3029 (GDB 605884), WI-17074 (G22903), IM9 (GenBank R05832),
D16S3096 (GenBank ), D16S516 (GDB 200080). PCRs for GenBank AA368108
(forward 5'-TAATCCTCAGCCTCTAGAATGCCT-3', reverse 5'-
GTATGATGATTTTCAGGGAGAAAC-3') and GenBank AA398024 (forward 5'-
TGTCCTCAACTGATTCTrACAAAC-3', reverse
5'-TCAATGGGTTAGGCACAGACC-3') were derived from partial sequence
analysis of BAC 353B15. Control PCRs for FRA3B deletions were D3S1234 (GDB
186387), D3S1300 (GDB 188420) and D3S1841 (GDB 254090).
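
The logic of the duplex design described above can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the described method: the class and function names, and the lane data other than the AGS result at STSG-10102 reported in this specification, are hypothetical. The point it shows is that the absence of a FRA16D-region product is only interpreted as homozygous deletion when the control product is present in the same reaction.

```python
from dataclasses import dataclass


@dataclass
class DuplexLane:
    """One gel lane from a duplex PCR (hypothetical record layout)."""
    sample: str
    target_band: bool   # product from the FRA16D-region STS seen on the gel
    control_band: bool  # product from the control locus (e.g. DMD Pm) seen


def interpret(lane: DuplexLane) -> str:
    """Interpret a duplex PCR lane.

    A missing target band is only meaningful if the control band shows
    that the reaction itself worked.
    """
    if not lane.control_band:
        return "uninformative"        # failed reaction or no template
    if lane.target_band:
        return "retained"             # at least one copy of the STS is present
    return "homozygous deletion"      # control worked, target absent


if __name__ == "__main__":
    lanes = [
        DuplexLane("AGS", target_band=False, control_band=True),
        DuplexLane("hypothetical line X", target_band=True, control_band=True),
        DuplexLane("no DNA control", target_band=False, control_band=False),
    ]
    for lane in lanes:
        print(lane.sample, "->", interpret(lane))
```
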

Results

Positional cloning of FRA16D

A contig of YAC clones was established in the 16q23.2 region between markers
c306D2 and c307A12 which were found by FISH to map proximal and distal to
FRA16D, respectively (Figure 1B). The individual YACs from this contig were also
used as hybridisation probes to further localise the fragile site. These experiments
identified the YAC 801B6 as spanning FRA16D, and therefore this YAC was used as a
source of DNA for subcloning the region to provide shorter DNA fragments for further
refinement of the fragile site position. In addition, BAC clones were identified from
the region to provide redundancy of cloned human DNA in an effort to avoid potential
problems of instability of human DNA in YACs, as has previously been noted for other
fragile site regions, including FRAXA (37), FRA10B (38 and O. Handt, pers. comm.)
and a Chinese hamster aphidicolin inducible fragile site region (39).

A pulsed-field gel restriction map of YAC 801B6 was constructed by using HincII
restriction fragment subclones of the YAC for use as hybridisation probes (H13m,
H22s, H23m, H29m and H40m) (Figure 2A). The position of the BACs (379C2,
325M3 and 353B15) with respect to the YAC restriction map was determined by both
the restriction mapping of the BACs and the positioning of common markers by PCR
or hybridisation (Figure 2A). The STS (D16S518, Afma336yg9, WI2755,
STSG-10102 and D16S3029) content of the YACs and BACs was also determined to
assist in map construction.

Subclone libraries of DNA from YAC 801B6 and BAC 325M3 were generated using
the lambda vectors λGEM12 and λGEM11 (Promega), respectively and assembled into
a contig by end-fragment hybridisation and restriction mapping. The integrity of the
YAC restriction map was verified by comparison with that of the BACs, 325M3 and
353B15. For the region between the BACs the integrity was verified by the use of long
range PCR using human chromosomal DNA as template (data not shown).

Localisation of FRA16D by fluorescence in situ hybridisation (FISH)
There have been difficulties in determining the precise localisation of common
chromosomal fragile sites using FISH (refs FRA3B (13, 40, 41, 42), FRA7G (18, 19)
and FRA7H (43)). The FISH data have been interpreted as due to the fragile sites being
spread out over long DNA sequences (eg 100s of kb) or that there are multiple fragile
sites at a single locus. An alternative explanation is that the DNA in the immediate
vicinity of the fragile site is not tightly 'packaged' into chromatin. We therefore chose
to score only those chromosomes where the width of the gap or break at the FRA16D
fragile site was greater than that of one chromatid (Figure 3). This approach was
intended to reduce the possibility that the 'unpackaged fragile site DNA' might be
looping back over the distant side of the fragile site and therefore give a false
'spanning' signal - particularly for probes that are very close to or within the fragile site
region. In addition, while the use of pre-reassociation in the hybridisation process
dramatically improved the signal to noise ratio, it did render repeat rich regions poor
hybridisation probes. This was particularly evident in the FRA16D region where there
is an abundance of DNA repeat sequences of various kinds.

The results of the FISH experiments are plotted in figure 4. The closest clearly
proximal probe to FRA16D is λ1-44 while the closest unequivocally distal probe is

These probes map at a distance of ~200kb apart. However, this 200kb region
includes consistent scatter of distal signal around λ1-38 and λ1-11 and the poor
hybridisation between λ181 and λ511 (due to repetitive DNA content). Therefore this
200kb defined by FISH analysis is likely to be the maximum sequence required to
define FRA16D rather than provide any evidence that the fragile site is spread over
such a distance.
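
The 85% scoring rule used in the figure legends to label each lambda clone as proximal, spanning or distal can be written down as a small Python function. This is a sketch under stated assumptions: the function name is hypothetical, the counts are FISH signals scored on either side of the FRA16D gap, and clones with no scorable signal are returned as "not scored". The example counts are invented for illustration and are not data from the figures.

```python
def classify_clone(proximal: int, distal: int, threshold: float = 0.85) -> str:
    """Classify a lambda clone from its FISH signal counts.

    proximal / distal: number of signals scored on each side of the gap.
    A clone is 'proximal' or 'distal' when more than `threshold` of its
    signals fall on that side, otherwise it is treated as 'spanning'.
    """
    total = proximal + distal
    if total == 0:
        return "not scored"
    if proximal / total > threshold:
        return "proximal"
    if distal / total > threshold:
        return "distal"
    return "spanning"


if __name__ == "__main__":
    # Invented counts, for illustration only.
    for name, p, d in [("clone A", 18, 2), ("clone B", 10, 10), ("clone C", 1, 19)]:
        print(name, classify_clone(p, d))
```
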

Detection of homozygous deletion in tumour cell lines

The FRA3B fragile site - FHIT gene intron 4 region is a frequent site of deletion in
various types of cancer (8). Homozygous FRA3B deletions have been detected in
various human adenocarcinoma cell lines including (gastric) AGS, Kato III; (breast)
MDA-MB436; (colon) LoVo, HT29, SW480 and LS180 (8). Since these deletions are
somatic events that presumably occur as a result of exposure of these cells to certain
environmental factors (11), we chose to analyse tumour cell lines which exhibit FRA3B
deletions for the presence of homozygous deletion at the FRA16D locus.

STSs that were either mapped to the FRA16D region (Figure 1) or generated from
partial sequence analysis through the region (data not shown) were used to screen for
homozygous deletion in various tumour cell line DNAs. The STSs were duplexed with
a PCR from the dystrophin locus, as an internal control. The results for the analysis of
one of the FRA16D region markers, STSG-10102, are shown in figure 4. Of the seven
tumour cell lines tested, the stomach tumour cell line AGS was found to be
homozygously deleted at STSG-10102 and a series of contiguous markers through the
region (Table 1), thus suggesting the presence of minimal deletions spanning the
FRA16D region in each chromosome 16 present in the AGS cell line.

Detection of heterozygous deletion in AGS tumour cell line DNA
The maximal extent of heterozygous deletion in the AGS tumour cell line in the
FRA16D region was determined by genotyping polymorphic markers. The markers
D16S518 and D16S3029 both gave two alleles, indicating proximal and distal outer
limits to the deletion of either chromosome 16 in AGS cells (Figure 2A). The markers
Afma336yg9 and 504CA were uninformative and therefore did not aid in delineating
the limits of heterozygous deletion.
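
The reasoning above, that a marker showing two alleles cannot lie within a deletion on either chromosome, can be sketched as follows. This is an illustration only: the function name and status labels are hypothetical, and the marker order used in the example (in particular the position of 504CA) is an assumption made for the sketch rather than a statement of the map.

```python
from typing import List, Optional, Tuple


def deletion_limits(
    markers: List[Tuple[str, str]],
) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return the nearest heterozygous markers flanking the deleted block.

    markers: (name, status) pairs in proximal-to-distal order, where status
    is 'het' (two alleles seen), 'uninformative' (one allele seen) or
    'absent' (no product, i.e. homozygously deleted).
    Returns None when no marker is absent.
    """
    absent = [i for i, (_, status) in enumerate(markers) if status == "absent"]
    if not absent:
        return None
    first, last = absent[0], absent[-1]
    # Walk outwards from the deleted block to the first heterozygous marker.
    proximal = next(
        (name for name, status in reversed(markers[:first]) if status == "het"), None
    )
    distal = next(
        (name for name, status in markers[last + 1:] if status == "het"), None
    )
    return proximal, distal


if __name__ == "__main__":
    # Marker order is assumed for illustration.
    ags = [
        ("D16S518", "het"),
        ("Afma336yg9", "uninformative"),
        ("STSG-10102", "absent"),
        ("504CA", "uninformative"),
        ("D16S3029", "het"),
    ]
    print(deletion_limits(ags))
```
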

Discussion

The region in which the chromosomal fragile site FRA16D is located has recently been
shown to be associated with two types of chromosomal instability in cancer. In
multiple myeloma, translocation of Ig loci into the 16q23 region causes the
dysregulation of the c-MAF proto-oncogene on the affected allele. While these
breakpoints are spread over at least 500kb they bracket both the c-MAF gene and the
FRA16D fragile site (1 and figure 1). The dysregulated expression results in elevated
c-MAF mRNA levels, which is thought to contribute to neoplasia. These translocations
were not identified by conventional cytogenetic analysis. Their detected frequency in
multiple myeloma cell lines suggests an incidence of ~25%.

Using representational difference analysis to identify differences between the genomes
of normal and tumour cells, the FRA16D region has also been shown to be the site of
homozygous deletion in three different types (lung, ovary and colon) of
adenocarcinoma (29). The commonly deleted region includes FRA16D, with the
minimal deletion in colon tumour cell line corresponding almost exactly to the ~200kb
region shown by our FISH studies to span the FRA16D fragile site. If common
aphidicolin fragile sites confer susceptibility to mutagen induced DNA instability in
cancer then tumour cell lines which have been shown to have such instability at one
fragile site are likely to exhibit instability at another fragile site. By analysing tumour
cell lines with known FRA3B deletions, we have found that the AGS cell line derived
from a stomach cancer exhibits homozygous deletion spanning FRA16D.
Heterozygosity of the flanking markers D16S518 and D16S3029 indicates that the
chromosome 16 deletions are confined to the immediate vicinity of FRA16D.

Taken together these deletion data confirm the hypothesis that FRA16D is associated
with specific chromosomal instability in cancer.

Given that the observed deletions are homozygous they are therefore likely to represent
the loss of a negative function (eg tumour suppressor) rather than the gain of a tumour
promoting function. If the analogy with the FRA3B locus holds then a gene either
spanning or, at least partially, within the FRA16D commonly deleted region may
contribute to neoplasia as a consequence of quantitative and/or qualitative effects of the
deletion. Alternatively, the proximity of the FRA16D deletions to the c-MAF gene
suggests that they have the potential to affect c-MAF expression. The FRA3B fragile
site is associated with a region of 'late' replication (48) as are the 'rare' fragile sites
FRAXA and FRAXE (49,50). Assuming that replication timing is affected by
proximity to fragile site loci and, given the coupling of replication with transcription,
the deletion of the FRA16D region may lead to an alteration in the timing, with respect
to the cell cycle, of the expression of genes in the area - including c-MAF.

ABBREVIATIONS BAC, bacterial artificial chromosome; DAPI,
4',6-diamidino-2-phenylindole; FISH, fluorescence in situ hybridisation; FITC, fluorescein
isothiocyanate; LOH, loss of heterozygosity; FHIT, fragile histidine triad; FRA, fragile
site locus; PCR, polymerase chain reaction; STS, sequence tagged site; YAC, yeast
artificial chromosome

EXAMPLE 2 - DNA SEQUENCING OF THE FRA16D FRAGILE SITE AND THE
FOR16D GENE

Materials and Methods

Large scale sequencing of FRA16D included

a) Sonication libraries and

b) Nebulization libraries of BAC clones 325M3 and 353B15 and

c) Restriction fragments of lambda clones

(for sequencing between BAC 325M3 and BAC 353B15)

a) Construction of sonication libraries:

10 µg of each BAC DNA were sonicated for 20 seconds using the Ultrasonic Inc. Heat
Systems Sonicator (50% duty, 3.5 power).
Blunt ends were created with 40 U of Mung Bean Nuclease at 30 °C for 25 minutes.
The products were size-fractionated on a 1% agarose gel and fragments ranging from
0.8-1.9 kb were extracted from the gel with the Qiaquick Gel Extraction Kit.
1500 ng of sonicated DNA were ligated into pUC-Sma plasmid vector and cloned into
Sure cells (electroporation-competent, Stratagene).

600/1500 clones of the sonication libraries of BAC 325M3/353B15 respectively were
gridded on 96 well plates and sequenced in one direction using the M13-forward
primer.

Sequences were assembled into contigs in the gap4 program on a UNIX computer.
For a selected number of clones, sequences with the M13-reverse primer were also
retrieved and assembled. Restriction maps of the contigs were compared to physical
mapping data. Rearranging and editing of the sequence was undertaken with the
"LaserGene" computer program.

Numerous primers were designed and PCR-products sequenced to close gaps between 
contigs. 
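
Comparing the restriction map of an assembled contig with physical mapping data amounts to predicting fragment sizes from the sequence. The following Python sketch shows one way this could be done for a linear sequence; it is not the gap4 or LaserGene procedure used above. The function name and the toy sequence are hypothetical. The Pst I recognition site (CTGCAG, cut after CTGCA on the top strand) is standard enzyme knowledge rather than something stated in this specification.

```python
from typing import List


def digest(seq: str, site: str, cut_offset: int) -> List[int]:
    """Return fragment lengths from cutting a linear sequence at every
    occurrence of `site`, cutting `cut_offset` bases into the site
    (top strand only; overhang geometry is ignored)."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Toy sequence for illustration; Pst I cuts CTGCA^G.
    toy_contig = "AAACTGCAGTTT"
    print(digest(toy_contig, "CTGCAG", 5))
```

The predicted fragment sizes can then be compared with the sizes seen on a gel for the corresponding clone.
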

b) Construction of nebulization libraries:

10 ng of each BAC DNA were nebulized at 10 psi for 45 seconds.
Size-fractioning and cloning was done as described above.

300/500 clones of BAC 325M3/353B15 respectively were sequenced as described
above and included in the assemblies.

Subclones for sequencing of BAC 353B15 were picked randomly, whereas
BAC 325M3 subclones were selected after specific hybridisation experiments.

c) Subcloning of restriction fragments of selected λ-clones was done in pUC19 vector.
Clones were sequenced with M13-forward+reverse primers as well as with specific
primers.

The Nucleic Acid Sequence (FOR16D)

The inventors have prepared a DNA sequence for the FRA16D fragile site and the
minimal overlapping region of homozygous deletion in adenocarcinomas of the lung,
colon, stomach and ovary and in doing so have discovered a gene located at
chromosome 16q23.2 and determined its DNA sequence.

An overview of the sequence data can be seen in Figures 6 and 7. An approximate
restriction enzyme map, shown in figure 7C, was prepared for the YAC My801B6.
Sequence for FRA16D was obtained primarily from BAC 353B15. The DNA
sequence of FRA16D is presented in figure 16. This is approximately 270kb long and
is bounded at both ends by an exon (termed exon 4 and exon 5 respectively).

The exons were compared with ESTs in the GenBank database and two EST
clusters were identified. These are indicated as I and II respectively in Figure 7. Both
of these are splice variants of the one gene. Further sequence data for contig #208 and
contig #779 were obtained from BAC 325M3 to identify two further exons, 2 and 3
respectively. The DNA sequences of contig #208 and contig #779 are presented in
figures 17 and 18 respectively. Homologies for the unlocalised DNA were searched for
again in the same database and this identified a further EST cluster, termed ESTIII.
The ESTs with homologies are again listed in figure 7. DNA sequence
information of the BAC 009280 identified a further exon which was termed exon z.
The remainder of the unlocalised portion of the EST cluster was termed exon a. A
further exon was identified on searching through EST databases for homologies with
exon a to identify a yet further EST cluster, ESTIV, which is a combination of exons a
and w.

The sequence defining the FRA16D site is flanked by two exons of the FOR16DI gene
with no other detected transcript within this intron. In addition, the breakpoints of three
out of five 16q23.2 translocations associated with multiple myeloma (Chesi et al 1998)
map within the alternate splice of this FRA16D intron, that is between exons 4 and x.

DNA sequence for each of the exons was compiled by a comparison of the EST
clusters against each other as well as against chromosomal sequence. These are set out
in Figure 15.

The positions of exons a, 1, z, w, 6 and x have only been approximately mapped on the
basis of their presence on certain subclones (containing localised markers within
16q23) as judged by hybridisation experiments.

Composite DNA transcript sequences have been prepared from the EST clusters and
putative DNA sequences for four variants of the gene FOR16D (I, II, III and IV) have
been compiled and are presented respectively in figures 9, 10, 11 and 12.
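
Conjoining ESTs into a composite transcript relies on finding the region where the end of one sequence overlaps the start of the next. The Python sketch below illustrates the idea with exact suffix-prefix matching; it is a simplified illustration and not the procedure used to build the composites in figures 9 to 12. The function name and the toy sequences are hypothetical, and real EST reads contain errors that exact matching would not tolerate.

```python
from typing import Optional


def conjoin(left: str, right: str, min_overlap: int = 20) -> Optional[str]:
    """Join two sequences where a suffix of `left` exactly matches a prefix
    of `right`. The longest overlap of at least `min_overlap` bases is used.
    Returns None when no such overlap exists."""
    left, right = left.upper(), right.upper()
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


if __name__ == "__main__":
    # Toy sequences with a 4-base overlap, for illustration only.
    print(conjoin("ACGTACGTAA", "GTAACCGG", min_overlap=4))
```
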

The predicted amino acid sequences are presented in figures 13A to D. These amino
acid sequences were compared with amino acid sequences stored on the Swissprot
amino acid sequence database using the program BLAST (Altschul et al (1997) Nucleic
Acids Res. 25:3389-3402). Two significant groups of homologies were found. A first
of the homologies is identified in FOR16DIV relative to a protein related to DNA
replication (human peptidyl-prolyl cis-trans isomerase) and three other proteins. As can
be seen the string LPPGWEERT appears highly conserved. This string lies within
exon A and implicates that exon as having a role in DNA replication or chromosomal
division. This is compatible with FOR16D being associated with tumourigenesis in
that one group of proteins having an association with cancer falls into this group. This
amino acid string and the DNA sequence encoding it may also be very useful for
identifying other cancer associated genes.
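
As a worked illustration of how a predicted amino acid sequence follows from a composite transcript, the Python sketch below translates from the first ATG using the standard genetic code. It is a minimal sketch and not the inventors' procedure; the function name is hypothetical. The input is the opening of the FOR16DIV composite as printed in Figure 12 of this document, so any transcription error in that text would carry through. On that input the translation contains the conserved string LPPGWEERT discussed above.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_first_atg(seq: str) -> str:
    """Translate from the first ATG to the first stop codon (or sequence end).
    Codons containing characters other than A, C, G, T are written as 'X'."""
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Opening of the Figure 12 composite as printed in this document.
FIG12_START = (
    "GCGTAGGGGggccaggtgcctccacagtcagccATGgcagcgctgcgctacgcggggctg"
    "gacgacacggacagtgaggacgagctgcctccgggctgggaggagagaaca"
)

if __name__ == "__main__":
    print(translate_from_first_atg(FIG12_START))
```
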

Another group of homologies that were found are further downstream for the FOR16DI
gene; these provide several relatively strong homologies in several different amino
acid strings for some oxidoreductase genes. These strings include VVVVTGANSGIG,
MTLDLALLRSVQ, PLDVLVCNAA and VNHLGHFYL. There is a potential that
FOR16D has an oxidoreductase activity. Association of oxidoreductase activity with a
protein associated with DNA replication or chromosomal division has to date not been
published.

The RNA Transcript

The estimated size of the major alternatively spliced transcript, as determined by
Northern blotting (Figure 8) using a portion of exon 3 as probe, is about 2400
nucleotides and most likely corresponds to the transcript of FOR16DII. There is a
smaller transcript which is about 1.6kb and this is most likely to correspond to
transcript I. A similar experiment has been conducted where the probe is selected from
exon X and the 2.4 kb transcript is seen again, supporting the view that the 2.4kb
transcript is the FOR16DI transcript.

For the purposes of working the invention a large number of references to pertinent
methodologies are set forth in the following US patent documents: US 5981218 to Rio
et al, US 5928884 to Croce et al, US 5945522 to Cohen et al, and US 5837492 to
Tavtigian et al. These documents are incorporated herein entirely specifically for
purposes of permitting working of the invention.

For the purposes of this specification the word "comprising" means "including but not 
limited to", and the word "comprises" has a corresponding meaning. 

Figure 4: Fluorescence in situ hybridisation of lambda subclone contig to FRA16D

Figure 5: Duplex PCR deletion detection at the FRA16D locus in tumour cell lines

Figure 7: Map of FOR16D alternate transcripts with respect to FRA16D and common homozygous deletion

FIGURE 9

Composite FOR16DI transcript

tj96c08.x1, wb85d11.x1, HHCMA56, qg88f04

GCGTAQGGGaacc aaatacctccacaatcaQccATGacaacqctacactacGcaaaQcta 
OLa cgacacqqacaat qaqqacqaqctqctccqqQQC tqqqaqqaqaqaacc acaaaqQac 
ggctgggtttactacqccaatcac accqaqQaqaaqact caqtqqqaacatccaaaaact 
qqaaaaaqaaaacqaqtqqcaqqa qatttqccatacqqa tqqqaacaaqaaactqatqag 
aacqqac aaqtqtt ttttqttqaccatataaataaaaqaaccacctacttc Qacccaaqa 
ctqqcqtttactqtqqatqata atccqaccaaqcca accacccq qcaaaq acacqacqgc 
aqcaccac tqccatqqaaattctccaqqqc cqqqatttcactqqcaaaqtccttqtqqtc 
actqqaq ctaattc aqqaataqqqttcqa aaccqcc aaqtcttt tqccctccatqqtqca 
catqtqa tcttqqcctqcaqq aacatqqc aaqqqcq aqtqaaqc aqtqtcacqcatLtta 
gaaqaatgg at:cttgg:cctgcaggaaca 

ttagaagaatggcatagggccaaggtagaaacaatgaccctggacctccczccg^ 

agcgtgcagcattntgctgaagcattcaaggccaagaacgtgcctctcca-gtgctcgeg 

tgcaacgcagcaacttttgctctacccggagtctcacaaagatggccnGcagacaccttc 

caagtgaatcatctggggcacttctaccttgtccagctcctccagga-gc-ztgngccGc 

tcagc tec tgcccgtg teat tgtggtctcctcagagtcccatcgatttacagatattaac 

gactccttgggaaaactggacttcagtcgcctctctccaacaaaaaacgac-attcggcG 

atgctggcttataacaggtccaagctctgcaacatcctcttctccaacgacrcgcaccgt 

cgcctctccccacgcggggtcacgtcgaacgcagtgcatcctggaaa-a-gatgtactcc 

aacattcatcgcagctggtgggtgtacacactgctgtttaccttggcgacgcctutcacc 

aagtccatggtttcagactgcctggtagaaggaggtcacttctgattgtcag-gactttg 

agctgagtgctgaaataaaatgataaacaagtcaaaaa 



FIGURE 10

Composite FOR16DII transcript

tj96c08.x1, wb85d11.x1, HHCMA56

GCGraQ^OGaaccaaQtqcctccacaqticaaccATGQcaqca ctQcactacacqqqQCtQ 

qacqacacqqacaqtqaqqacqaqctiqctc cqqqqctqqqaqqa qaqaaccacaaaqqac 

qqc tqqq 1 1 tac t acqccaa t cacaccq aqqaqaaq ac t caq tqqqaaca tccaaaaac t: 

qqaaaaa qaaaac qaqtqqca qqaqatt tqccatac qqatqq gaacaaqaa actqatqaq 

aacqqac aaqtqtt ttttqttqaccatataaataaaaqaaccacctacttcqacccaaqa 

ctqqcq tttactq tqqatqataatccqaccaaqccaaccac ccqqcaaaqatacqacqqc 

aqcaccactqccatqqaaattctccaqqqccqqqatttcactqqcaaaqtcattqtqqtc 

actqqaqctaattcaqqaata qqqttcq aaaccqcc aaqtct tttqccct ccatqqtqca 

catqtqatcttqqcctqcaqqaacatqqcaaqqq cqaqtq aaqcaqtq tcacqcatttta 

qaaqaatqq atcttgacctgcaggaacatgg^ 

ttagaagaatagcataaagccaaga 

accg t gcagcat 1 1 tgc t caagca t tcaaggccaagaa tg- gc c t c t zcaz^czcct zazg 

tGcaacgcagcaacttttcc 

1 1: t c aag t g a a t c a t c t 

Gccgct cage tec cccccgtg tea 

t caacgac tec utcgga aa^ 

gcgcgatactggcttataacaggtc^ 

accgccgcctctcccacgcggg 

gcactccaacactcat^ 

tcccaccaagtccatgcaaca 

ac tggagggtctaggagggatgta^^^ 

agctcagagcgaaaagacggccc 

cgcttggcagccagtccggctaagtgg 

gtgtgtgtcccctcacgcaag 

atccgcaagagtaaaggaaat 

tgggaagcagggaattcctsgggtaaagt 

tctct.ttgctttct;^ 

cgtatctccctggagaagc^ 

ggtcccctcgtccatccagct^^^^ 

cctacttagggaagaaaaa^ 

attgtttcattcatcctga^ 

tcagaaccttgtcccagccagt^ 

gaactaccaggtggcaaagtac 

c t t t agag a t t a t aa c^ 



Figure 11 

Composite FOR16DIII transcript

tj96c08.x1, wb85d11.x1

GCGTAQGGGaaccagqtacctccacaQticaaccATGQcaacactacactacccaaaacta 
gacgacacggaGagtgaggacgagctgctccggggctgggaggagagaaccacaaaggac 
ggctgggtttactacgccaatcacaccgaggagaagactcagtgggaacanccaaaaact 
ggaaaaagaaaacgagtggcaggagatttgccatacggatgggaacaagaaactgatgag 
aacggacaagtgttttttgttgaccatataaataaaagaaccacctacttcgacccaaga
ctggcgtttactgtggatgataatccgaccaagccaaccacccggcaaagatacgacggc 
agcaccactgccatggaaattctccagggccgggatttcactggcaaagtcgttgtggtc 
actggagctaattcaggaatagggttcgaaaccgccaagtcttttgccctccatggtgca 
catgtgatcttggcctgcaggaacatggcaagggcgagtgaagcagtctcacgcatttta 
gaagaatggaaaacaaaataccaccctccgccagaaaagtgcagaataaaaattttcccc
tagcaaaagaaggaaaaaataaaagatcttgaatagtttcatcaaaaaaaaaaaaaaaa 



FIGURE 12 



Composite FOR16D transcript IV
[tj96c08.x1, tm78c03.x1, qi38g12.x1]

GCGTAGGGGggccaggtgcctccacagtcagccATGgcagcgctgcgctacgcggggctg 
gacgacacggacagtgaggacgagctgcctccgggctgggaggagagaacaccaaggacg 
gctgggtttactacgccaagtaagggggccgcagtggggccgcggacgcacctgggaccc 
tgcacagcccacggacgccacctgcgcggggaggacgcgcactccagcgcagcgcgtgcg 
gtgcaaagtgaaagtaactgttaaggagcttcagggaaaagggtccagggttcccagtag
gggccggcccccttggtgggcctcgggtccagcgggggtcacctggtgccctcccggcgc 
gccctctgctgttcaggatgcagcactgcgcggcgcggcgagggcaaagccgcctcatcc 
ccgccaaaaaataaagatgttttaaaaagcgcaaaa 



FIGURE 13 A

amino acid sequence of FOR16DI

[qi38g12.x1, wb85d11.x1 and HHCMA56 and qg88f04]
f MAALRYAGLD 1 [ DTDSEDEiJ C ^WEERTTKDGWVYYANHTEEKTQVjTKPKTGKRXRVA 
DLPYGWEQETDENGOVFFVDHINKRTTYLDPRl^FTVDDN PTKPTTR QRyiXSSTTA^^ 
QGRDFTGKWWTGANSGIGFETAKSFALHGAHV] . [I LACPJmMS.5.rvV£^^ 

/RAKVETMTLDLALLRSVQHXAEAFKAKWPLHVLVCNAATFALPGVSQRWPGDTFQV^ 

IX^HFYLVQLLQDVLCRSAPARVIWSSESHRFTDINDSL/^KLDFSRLSPTK^^ 

NRSKLCNILFSNELHRRLSPRGVTSNAVHPGNMMYSNIHRSWWVYT 

SDCLVEGGHF 

FIGURE 13 B 

amino acid sequence of FOR16DII

[qi38g12.x1 wb85d11.x1 overlap and HHCMA56]

f^^AALRYAGLD1J ^DSEDELLRGWEERTTKIX^WVYYA^ mTEEKT0W TH^ 
DLPYGWEQETDENGQ VFFVDHII^RTTYLDPRLAFTVDDNPTKPTT RQ?/fDGSTTAI-IEIL 
QGRDFTGKWVVTGANS GIGFETAK SFALHGAHVj_ I LACRM4ARASEAVSRI LEeVHKAK 

VEAMTLDLALLRSVQHFAEAFKA1<IJVPLHVLVCNAATFALPWSLTKIX-LE^^^ 

FYLVQLLPGMFCAAQLLPVSLWSPQSPIDLQILTTPWENWTSVASLQQ:<TTIGRa-JLITG 

PSSATSSSPTSCTVASPTRGm/ERSDRSWKYDVLQHSSQLVGVHTAV-fLGEAFHQVKATG 

SCHHRVLCCCPRTGGSRRDVLQQLLPLHALTRSSERRDGPDPVGLSEPI.IQERLAASPAK 

WSSERMGTHTRPVCVPSRKCQAGPLPNVPPTQIRKSKGNKSIHNRVKNLK-/^^^ 

KVSLFWGWAKHRSLCFLWACLKVKTWLACRFRISLEKHQQFSSFYC'/?.IA 

FIGURE 13 C 

amino acid sequence of FOR16DIII 

[combination of qi38g12.x1 and wb85d11.x1]

JilMLEYAGLEaDTDSEDEL.W^ 

LPYGWEQETDENGQWFVPHINTO^TTYLDPRLAFTy^ 

GRDFTGlfWVVTGANSGIGFETAKSFALHGAHVIIACRNMARASEAVS 

PPEKCRIKIFP . ^^'^^i^Vb 

FIGURE 13 D 

complete amino acid sequence of FOR16DIV

MAALRYAGLDDTDSEDELPPGWEERTPRTAGFTTPSKGAAVGPRTHLGPCTAH
GRHLRGEDAHSSAARAVQSESNC
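
The opening residues of this sequence can be cross-checked against the 5' end of the composite transcript IV listing. The sketch below is an illustration only and is not part of the original figures: it assumes the standard genetic code, uses the first two lines of the transcript IV sequence as printed, and the names `CODON_TABLE`, `TRANSCRIPT_IV_5PRIME` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varies fastest): TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# First two lines of the composite transcript IV listing (120 nt).
TRANSCRIPT_IV_5PRIME = (
    "GCGTAGGGGggccaggtgcctccacagtcagccATGgcagcgctgcgctacgcggggctg"
    "gacgacacggacagtgaggacgagctgcctccgggctgggaggagagaacaccaaggacg"
)


def translate(dna):
    """Translate from the first ATG until a stop codon or the end of input.

    Codons containing characters other than T, C, A, G are reported as 'X'.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


print(translate(TRANSCRIPT_IV_5PRIME))
```

Translating from the first ATG gives MAALRYAGLDDTDSEDELPPGWEERTPRT, which agrees with the start of the FOR16DIV sequence listed above.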



Figure 14 



Amino acid homology sequence motifs 

FOR16DIV has a conserved motif possibly connected to DNA replication
sp Q13526 PIN1 Human Peptidyl-prolyl cis-trans isomerase NIMA-interacting

15 EDELPPGWEERTPRTAG 31 

4 EEKLPPGWEKRMSRSSG 20 

sp P46935 NED4 Mouse NEDD 4 

12 TDSED--ELPPGWEERT 26 

524 TDSNDLGELPPGWEERT 540 

sp P46934 NED4 human NEDD-4 Protein (KIAA0093) 
9 LDDTDSEDELPPGWEERT 26 
524 LDTSNDLGPLPPGWEERT 540 

sp P54353 DOD DROME DODO Protein. 

16 DELPPGWEERTPRTAGFT 33 

5 EQLPDGWEKRTSRSTGMS 22 
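
The degree of conservation in the ungapped alignments above can be tallied directly. The following is a minimal sketch and is not taken from the original figure: it assumes plain position-by-position identity with no substitution scoring, and `count_identities` and `ALIGNMENTS` are hypothetical names.

```python
def count_identities(query, subject):
    """Count positions at which two aligned, equal-length sequences
    carry the same residue. Gap characters ('-') never count."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != "-")


# Ungapped motif pairs copied from the alignments above
# (FOR16DIV residues first, database entry second).
ALIGNMENTS = {
    "PIN1 (Q13526)": ("EDELPPGWEERTPRTAG", "EEKLPPGWEKRMSRSSG"),
    "DODO (P54353)": ("DELPPGWEERTPRTAGFT", "EQLPDGWEKRTSRSTGMS"),
}

for name, (query, subject) in ALIGNMENTS.items():
    n = count_identities(query, subject)
    print(f"{name}: {n}/{len(query)} identical")
```

On these strings the count is 10 of 17 positions for the PIN1 pair and 9 of 18 for the DODO pair.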



FOR16DI has homologies with amino acid sequences of oxidoreductase enzymes 



sp P13653 PGR Horvu Protochlorophyllide reductase precursor

126 WWTGANSGIGFETAKSFALHGA-HVILACRNMARASEA 
:::+:::+::+: : : + : : ::++::: + : : + : 
76 VWITGASSGLGLAAAKALAETGKWHWMACRDFLKASKA



201 MTLDLALLRSVQHXAEAFKAKIWPLHVLVCNAA 
: :::: : ::+ +::+ +:: ::::::: 
129 MHLDLASLDSVRQFVDAFRRAEMPLDVLVCNAA 



251 VNHLGHFYLVQLLQDVLCRS--APARVIWSSES
::::::::+::+:+; : +++ : ; + 

183 VNHLGHFLLARLLMEDLQKSDYPSRRMVIVGSIT 



sp P35320 OxiR STRLI Probable oxidoreductase



127 WTGANSGIGFETAKSFALHGAHVILACRN 
: : : : : : + : : : ++ : : : : + : : + 
2 WTGGASGLGAETVRALAAAGAEVTIATRH



203 LDLALLRSVQHXAEAFKAKNVPLHVLVCNAATFALP 
:::++:: ::++ ::+:::: ::: 

62 LDLSDVASVDSFARAWRGPLDILVANAGIMALP



252 NHLGHFYLVQLLQDVLCRSAPARVIWSSESHRFT
: + : : '+ ::++::+++: : 

111 NYLGHFALATGLHAALRDAGSARIWVSSGAHLGT



FIGURE 15 



DNA sequence of exons 
Exon A 

ggccaggtgcctccacagtcagccatggcagcgctgcgctacgcggggctggacgacacg 
gacacggacagtgaggacgagctgctccggggctgggaggagagaaccacaaaggacggc
tgggtttactacgccaatcacaccgaggagaagactcagtgggaacatccaaaaactgga
aaaagaaaacgagtggcaggagatttgccatacggatgggaacaagaaactgatgagaac 
ggacaagtgttttttgttgaccatataaataaaagaaccacctacttcgacccaagactg 
gcgtttactgtggatgataatccgaccaagccaaccacccggcaaagatacgacggcagc 
accactgccatggaaattctccagggccgggatttcactggcaaagtccctgtggtcact 
ggagctaattcagga

Exon 1 

atagggttcgaaaccgccaagtcttttgccctccatggtgcacatgtcazcttggcctgc 
aggaacatggcaagggcgagtgaagcagtgtcacgcattttagaagaacgg 

Exon Z 

aaaacaaaataccaccctccgccagaaaagtgcagaataaaaattttcccccagcaaaag 
aaggaaaaaataaaagatcttgaatagtttcatcaaaaaaaaaaaaaaaa 

Exon W 

gtaagggggccgcagtggggccgcggacgcacctgggaccctgcacaccccacggacgcc 
acctgcgcggggaggacgcgcactccagcgcagcgcgtgcggtgcaaagcgaaagtaact 
gttaaggagcttcagggaaaagggtccagggttcccagtaggggccgccccccttggtgg 
gcctcgggtccagcgggggtcacctggtggcttcccggcgcgccctctcctgttcaggat 
gcagcactgcgcggcgcggcgagggcaaagcggcctcatccccgccaaaaaataaagatg 
ttttaaaaagcgcaaaa

Exon 2 

cataaagccaaggtagaagcaatgaccctggacctcgctctgctccgtaccgtgcagcat 
tttgctgaagcattcaaggccaagaatg



Exon 3 

gcctcttcatgtgcttgtgtgcaacgcagcaacttttgctctacccggagtctcacaaag 
atggcctggagacaccttccaagtgaatcatctggggcacttctaccttgtccagctcct 
ccaggatgttttgtgccgctcagctcctgcccgtgtcattgtggtctcctcagagtccca 
tcg

Exon 4

atttacagatattaacgactccttgggaaaactggacttcagtcgcctctctccaacaaa 
aaacgactattgggcgatgctggcttataacaggtccaagctctgcaacatcctcttctc 
caacgagctgcaccgtcgcctctccccacgcggggtcacgtcgaacgcagtgcatcctgg 
aaatatgatgtactccaacattcatcgcagctggtgggtgtacacactgctgtttacctt 
ggcgaggcctttcaccaagtccatg 

Exon 5

gtttcagactgcctggtagaaggaggtcacttctgattgtcagtgactttg 
Exon 6 

agctgagtgctgaaataaaatgataaacaagtcaaaaa 
Exon X 

caacagggagctgccaccaccgtgtactgtgctgctgtcccagaactggagggtctagga 
gggatgtacttcaacaactgctgccgctgcatgccctcaccagaagctcagagcgaagag
acggcccggaccctgtgggcctcagcgagaggctgatccaagaacgcttggcagccagtc 
cggctaagtggagctcagagcggatgggcacacacacccgccctgtgtgtgtcccctcac 
gcaagtgccaggctgggccccttccaaatgtccctccaacacagatccgcaagagtaaag 
gaaataagagcattcacaacagagtgaaaaatcttaagtaccaatgggaagcagggaatt 
cctggggtaaagtatcacttttctggggctgggctaggcataggtctctttgctttctgg 
tggtggcctgtttgaaagtaaaaacctggttggcgtgtaggttccgtatctccctggaga 
agcaccagcaattctcttccttttactgttatagaatagcctgaggtcccctcgtccatc 
cagctaccaccaccaccaccactgcagccaggggctggccttctcctacttagggaagaa
aaagcaagtgttcactgctccttgctgcattgatccaggagataattgtttcattcatcc
tgaccaagactgagccagcttagcaactgctggggagacaaatctcagaaccttgtccca
gccagtgaggatgacagtgacacccagagggagtagaatacgcagaactaccaggtggca 



aagtacttgtcatagactcctttgctaatgctatacaaaaaattctttacagattataac 
aaatttttcaaatcattccttagatacc 



FIGURE 16 



BAC #2 - finished sequence (270 kb)

GATGGGCGTTTATTATGAGATACAOSAAGACAGAGACCAGAGGTTC^^
ACCTGGTAQCTTGTTACACATAAAAATTCTTGGGCTTTAT^ 
CAGTCriGAGGlTTATTTTTTCGAAAACTGTAAGT^^ 
ATCraXACGGTCGTCATGCTGCAAAGATC^ 
TGCTCTACCCACCACATGCAAGCCCCAGTGTGT^^ 
CATTTATAAACGAGAGCATACAGTGTTTGGTTITCAGT^^
CATGTCCCTCXAGAGGACATGATCTCriTI^^ 
CATCCTATCATTTATGAGCATTTCGAGTGATTCC 

CTGGGTATGTAACCAAAGGAATAGAAATGArrGTATTATAAAGGTACATGCAC^ 

AGTAGAAAAGACATGGAATICACCCAATCCGAATTTrAACA^ 
GGTTCTGGCTCTGaVGAACTCAGGTGGGCCCC^^

TGTAATCXXACCACTTOnCAGGCCGAGGCGGG^^^ 

GAAACICCATCTCTACTAAAAATATAAAAATTTGCCG^ 

GGGGCAGGAGAATCGCTTGAACCTGGGAGGCAGATGTTC^ 

CAGAGCCAGAGTCTCTCTCAAAAAACGAAAAACAAA 
GATGCTACTTGTCTTGGCCAATGAAATGAAAAG^

GAGGCTGAGGCAGGCTIGATCACCTGAAGTCAAGAATT^ 

AAATACAAAAATAACAOCAAAAAAAAAAAAAAACCACTVACAAATTAGCa^^ 

ATTIXSGGAQGCTGAGGCAGGAGAATCTCTTC^ 

TAGCCTGAACTICAGAGAGAGACTAaxSTCTC^^ 
GAGCAAAGGAGTCITGTGGAAGACAATAGGGAGTGTTGAGC^

GGAAGGACTTCTTAATTCCAGCTGCXnXST^ 

AAATTAAAAGGCTAAAAGTTTTGGCTTATAAAAATGGTATTTCCIX^ 

GTTTTTCTTTGTTTTCGAGACGAAGTTTCGC^^ 

GTTTAAGCCATTCTCCTGCCTCT^GCCIXXr^ 
ATTTTTrAGTACAGACGGGGTTTCAGCATCnTC^

GCCTCCTAAAGGGTTGGGATTACAGGTGTCA^ 

TCy^TAAATTTTTAAACACTCTGCAGTrcAAAGAAAACATC 

GTTATTAGCAAATCCCTCAATGTGTTGACCAAGAGTX^^^ 

rTTTATKXrPITATAAGTACAGTTGTAGATTTACAA^ 
CGCAATTGAGCATATAGTACATGCTCnXXrrcTGTACCCACTGATC^^

TTGCATTAGTGTGGCACATTTGCrACAArrAATAAACCAA 

GCTGCOXXriTGTGTTGTACATTCTATAGGTTTGT^^ 

CATTCCAATTTCATACAGCGTAGTTTCACGCCTCT^ 

CTCATTTCCCCAGCAAAGTCCTTCTGAG^ 
ATTGTGATTOTGAriTGGATTTTCTrcATGGe^

riCAGCGGTTATCAGGTCACACCTGTAATCATGCCCACC^ 

AAGAGCTTACAGAGATCCTTCTGGTTCAGAGGA^ 

AGGCTAAATTATAGTCXATATCACTITTCCTTTAAATC^ 

TGCAGAAAAAAGTACTGATTGTCCAGCAGTTGaxnCAGGTA 
TGTGGCCATGAAACXriTraCXnX3GCCC^^

GGTCTGCOXTGTGCAGGCAGGAATGTTCAGGAAGCAC^^ 

GCATGGATCCTGACTAAGCGGAGCAGAT^^ 

ATGTGTTTTTTGTTAGTriTTGAGACAGAGTC^ 

GCAAACCCTGCXnx:CCAGGTTCAAGCX3AT^^ 
GCXXAGCTAATTTTTATArrTTTACTCGAGACXn'AG^

TCATATGCGCCCCTCAGCTTCCTAAAGTGCT^^ 

ATCACCAT TGTCAATCTIXSAAAGG AGACTTTG 

TCAGATTGCTTTTTTTTITTTT^ 

CTCCA^CTCTCCCT^^ 
TTTITCTATTTOTAGTAGAGATGGCATTTTGCCA^^

GTCTAAGACTCCCAAAGTGCTCGGATTACAGAIX^^ 

AGCTGTGTGGGAAGTCAGAACTTCGTTGCTTCATGT^ 

CSGAAAACTGGACTICAGTCGCCTCTCTCC^ 

CAACATCCTCTTOICCAACGAGCT^ 
TCATGTACTCCAACATlCATCGCAGCriX^^

GTAAGAGAACAGCTTC^IGGCGCCGCAAACACCTT^^ 

TTOCGGGCATGAGTCTGGTCTCAGTAATAACATTGTC^ 

AGGTTAAGTCriXnTTGGGTAAATGCGTCT^^ 

AAAGTCXriTATGGAAATGGTGATTTTTTTGTTTC^ 
TCTA'i-rriviuGAATGGAGCACTTCAAAACrcC^^

TTCAGCAGATGTGATTXXrrTTGlTT^^ 

TCTGTTGGAGAGTCCTTreATAGCTAGGAGTGT^^ 
CTGTQGTTACATTTGGAAGTGTTTACAAAGGTGATAAGATX^^ 



CGGGGTAAriCTG(XCTTGAATAACCCGCIX3^^ 
GATCHTCATITXmCCTGAAATTAGTATTTTAGAAA^ 
CACTATAGATACTATCAAGGGCAAAGCCXXXrrGTTT^ 
TGTATAAAACAGGCTTTCTCnTCXXSAGGGGGAAATAT^^
ACATTTTGAGACT(XTGAGTCAGGTTTG^ 
TCTCAGGACXXXnCTCTGGTTGGACCAC^ 
GrrceiCITACACCACTGAGGCCACC^^ 
TTAGTAGTCTTACGTCATTATGrCCAAAGCTTTTA^^ 
GTTGGAAAAGGGCAAAGGATTTAGAATCATGAGGGTICTAT^^ 
TGAAATCATrTGACATTTCTGAGGTTAATTTTCT 
AAGGTGGCTCAAATAATAGAGGTTTTXXriTCC 
ATGAGTAGCTCCTTTGTGATAATCTTTACTTTTC 
ATTIXSATAAGGTCTGAAGGCACAGATGGGAGGGATGAGGCTCC^ 
ATTCTGCTTCAGAACTGTCCTCXnG^ 

AGTTATAAAGGAGACTAGAAAAGTGCATAACTGGCTriTTGAa^^ 

AAGGATAGGGTAATQGTGTTGACACAGGCACTGTTGAATACCT 

GAAATQCTATCGATTTTATAGGATCGTTGTGAGAAGAAG^ 

ATCCACITGACCCACCATACCTCAAGCCACTT^^ 

GCCTAGAATACTCCTCCTOCACnCTTAATlTC 

TTTTGGTCTCnCAGATTATATGAAGCT^ 

GTAGCACTCCTrATGTGTTGAATGAGTAAATGAAGAGT^ 

TGAAAACATAAITTCTGTAGTTTTTTTTT^^ 

GTQGCACAATOCX^GGCTCACTGCAACCTC^ 

GGAITACAAGCACTGGCXACCAAACCX:^^ 

GGTCTCAAACTCCTGACCTCAGGTGATC^ 

QCCIX3GCCIX5TAGTnx:nU'lU'iX nUUTi ' lVinUUr^ 

aatgcagtggtgcaatcotggctactgcagcc^^ 
tgaaatacaggtgtcx:accaccacgcacx3gcix^ 

TCGTCTCAAA(XriXXXriCAGCCTCC^^ 
TAATGTCTGTCCXXXXriTCTCITGACAG^ 

TGTAAATGTTTOTAAATGAATACATCAAAGATGACCAGGTGCTG^ 

ATCCCTCTGAGGTAGGCAAGGGATTATAGTTTAGAATGTGTAAATACCT 

CACACAGCAAGCCCCCAAGGTCCCCCAGGCTCATG^ 

GGAGGGAATGAATGTAAGGAGAGAGAGGTGAGGCACAGGAGAGAC^ 

AGCCCTAGCTCTCTCCCATTCTAACTC^ 

AACTACCTa^TGGGCITCTCnTGAGAAlTAAAGACGTT^ 

GTTAACTACTATTGTTATraxriTGCAGTAGAAATTAAAGA 

TAACCATTATTGCTXnxrATTATTAACATTATTTK^^ 

CCAAC3CTC3GGAGGATCACriXWXXrrc(^^ 

GACATAGTGAGACCCTATCTCTAAAGGAAAAAAAAAATTATATGTATTCG^ 
ATGCATTGTGAAATGACTACCACAATCAAGCTAATTAACACAT^ 
AAGAACATTGAAAGTCrrAGrrCTCA<X:AATTI^^ 
ATGCTGTACAATAGAlXriXXAGAATCTATTCATr^^ 

atcaccccctccx:acttorccacta^^ 

taaattccacacaaaaagtgagatcatagagtatttc?^^ 

cttcccattcottatagatacxxzaatctcgtaatg^^^ 

ttcaao^taaaaacctcamtgggccaggcacagtggct^ 

aatcacctcaggttgggagttcgagaccagcctggt^ 

tggctggctggtcatggtogctcatgcctgtaat^^ 

attgaq?vccatcctagctaacatggtgaaacc^ 

cgcx3gtcgtoggtgcctctagtccx:agct;^^ 

agtgaqcctagatcxxraccactcxzact^^ 

AGGCTGrrcrGTCGrreACTCGTC 

GGAGGTTGCAGTGAGCrcACATCGCATCACl^^ 

AAAAACCCAAAACTATATGTGCAAGCCTCAGGAACXrAC^^ 

CACAGTCCAGMTTTAAGOCXnxXAf^^ 

ACrnCTTCTTAAAAACAAAGAGAGGGAAACTAGATCT^ 

CCnXXriTTATAGTGTTTTAAATCCGCC^^ 

GCerTAGTATGTOTAACAAATTATITAATGAAO^ 

TTTIXnTCIX n CTQC CCXnTGG^ 

TTTTCTTITAAGTITGTGGTGCCTTCTTT^ 

TGAiu-riuGGCTGCAACGCAGTTTAGATATTGGA^^ 

CGATGTITCK3GAATXnTTCTTCTGTG^^ 

TAAGGTAATXnTTTCCTAGCAAAAACAAAGAGGGCTGJ^^ 

TCTIX7ITTTTACACCAAGATAAAGAAATACTTTO 

TCATGGGTTTCCTTCTTTATGTCTTACT^ 

GCAAATCACTTATCCTCTCTGGTCC^^ 

TTAATAGAACAGTTTAACTTGAAAGGAAATAAAAAAAGGAGAC^^ 



ACeriviuCTATGAAATGGTTTTAGTAGGAAGTATTTTGAATGJAGA 
TTTTTCAAAACCAAAGGACrrcAAAAATC^^ 
TTGACAGAAATCACXrCTTTTXXXIAAACACATTI^^ 
ATAAAACAAAGCAAAACAACAACAAAAGAAGAACTATTGCAG^ 
TGTCTGCTGCCTTCTCTTACACTC^

'ITmTTTTTTTTTTTGAGACGGAGT^^ 
TCCGCCTCXXXSGGrTCACGCCATTCT^^ 
CTAATTTTTTGTATTTTTAGTAGAGACGGGGTrrcAC^^ 
CCCGCCTCGGCCTCCCAAAGTGC^^ 
CAAAAGTCrrTTTCAAGTGCTTAGGAGGAGGrrAA^

taaacagacatattcx:aagagtcgaatacttaggatcagtggc^^ 
gccgcacx5gtgggattgatctctaggaagtaaatg^ 
catcactttgcroxgatgactgcataagc^^ 
ttgttm:cttcagtitacatgaaaagaaggaatat^^ 
atogggagaggagagaggtcgattcttcctrattotc^^
catttgtaatctttcggtcttgagtcat^^ 
cagagagttcgttitctccncttccataa 

ACTATOGAATITGGTGAGGGGTCATATAACCATXXT^^ 
AATAGTGGAAACTGTTATTCTGTGGCCTGATC^ 

CAAGTTCnXXnAAAATATAAAGTCAGGGTAAACATTI^
AATATGCTTAATAAAGaCAGACTTGAAAGTTGGCTGC^^ 
GTTGGATCGGTGACICTATCATGATATCGGAGIX?^^ 
AAACCAGAGTATGGGGlTAGAAATGGAGGCAGAGGGACTACXlAlt^^ 
CAACTCAGTGAGCACCGTACCTGTAGTGTGTCT^ 

CCflCCAC?IX?IX3CAGCAGCercT^^

TGAACAGGCCTCAATGCXnATAGCAAGGCAAGGCTTG^ 
AACTGCGCACTGAGCTTTAGATCrCTGAGTGAGAAAf^^ 
AAGTGGTGACATCGCAAAAATCCCAGGTITrGTTGAAGT^^ 
GCTVTITCCCTTGTAGTTGAGATGGGATTTGA 

GGflCCTCACAGCTCTCXTOCTAATGGTGCTGAT^^
TCACCAGCTTTCTGAGCCTXnCTCATTACG^ 

CATCAATAATGATGGATATAGAATTTCTGa\.GTrATTATAATC^^ 

TATK^GTTnCTAAAAATGTGGCTAATAGAACTGAGGAAGTG^ 

AAAACACCACGTGTGCCAAGTGGCrrcCCATAC^^ 

GAGGTATGCAGTACTTCCTGCCACGATGGCTGTTACTCl^^
AAAATGTAATClXrKXrrcACAGAACAAGAGTCTGC^^ 
CTAAAGACTtTySGGAGAGGTTTACTGGCXiAGGT^ 
TGCACGTCTCATITCTGAGGAAGCCGTAGGGTGTGGAC^^ 
TTATATGTCACAAATGGACITTAGATAAGCATGAAGATGACGAAGAC^^ 

CCTCICTAGTGTATTTATATATTTAAGCGTTTO^

AAGTITATTXSAGGTTTTTGCCATTGAAAGTAATACGTC^ 
AATrCACAGGCATTCACTriUXJ-n-lCATAAACATTTAGGACTCTOT 
GGGTTGGTCArrTTCCTAGTAAGATAGACCAGACnxSACT^^ 
GCATTGCTTCCGTCAGGATAATGCATCAGTITCTT^ 

GATATAAG(XACGCAGTTGCATCCCATTrcTIX:^^
GATGCAATCTOXrrATCACXAGATGGCTAGTGTX^ 
CAOSGGTATTCXnx^GAGTACACATTCTGrn^^ 
CCTCCCTTCATTTCTTCTCTGTAGCT^ 
TGCTGTGACnTCCTGTGACTACGGTGATTTTAGTT;^^ 

CTCTTTAAGGAAmTTGTTGCAGATCAGTAAT^
TTAAATCTETCCCCITITACAGTTACACATT^ 

GTTTCTTTCTTCClCCXrrTCAATTTATGG^ 
TTCTAAAAAAGATCCAATATACTTCAAATAGCTQCT^^
AATAATGAAGTCACATX::ATACGl>XATrACATGT^ 
TAAAATCnTTTC3CA<Xri<X7VC^ 

TAGTAOCTATATTTATATATACATATTAGTCrrATGGGrrTTAATr^^ 
<^TTTTCAAGGCTAATCTTGGGTTTTTGC^ 

GATGCAGATTGACCATTTICCCAGATACCACXSAT^^

TTAfiAAOTAATAATTCCAAGTTCTACAATCrTAGAATT^ 
AGCGTGTTCAAGAAAATGATATACCAGGCITGTAAACnTI^ 
GGTAATGCCCCTGGCATCCATATCTAGrrATXX:AGTTATGCT^ 
AATtXnCCCACCACCGCCAACCCCACCl'l'r^ 

TCCXAGGTCCAAGCXGGTCCTTTGTGGAC^

TGTXX:AACCACATATnxnX?ITCGCAGATAACAGTAGAAA<^ 
TTAGGGAACTGCATTTTCAAATCATTTTACATT^^ 
TTCCTGTGCCCGTGTAAAGCCAGTCATCGTGC^^ 
TGCTX3G10riXOTXXAAAGTTATAAATTATAACTAG/W^AAG^ 



GGATGAAAACTGAACCTACXATCATTTATTAAG^ 

GGATGAGGAGCAATlXriTrTTTTm^ 

TCGGCTCACTQCAAACCTCTOCCTCC^ 

CGTGCCACCACACCCGGCTAATTTTTTGTAT^^ 

OrmACCTICATGATCCACCXrGCXnx^^ 

AGCAGTKTITTCAATGGAATATATTTGA^^ 

AAGTCATGGGGTTGTACTTGGAATGACIX3GCX^ 

GTTTCGGATCCCAGGTTCCAAACCTGCACTA^ 

ATCTGTGAAATGCAGACCITIXriXXa^^ 

TAAGCTGTXXTTAGAGTCrrGGATTCriTT^ 

TCTATTTCCACAAAGGATTTGGGGCATTAGGGAGGTAAAC^^ 

TTGATGGGTCGGCCACAAAAATCATGTCACAAGATCT^ 

CATCTAACATAAGCOSGGTGCriTQCTAGGGACAGAAT^^ 

AGaCAGATGATAGGCATGGAAATGAAGAGACAAACCACTCCGA^ . 

TITCACTAATTCATCTTCACAGAAAGAAATAATATG^ 

CAGQCTGAGGTGGGCTGGATCACTTGAGGCCGGGAGTI^^ 

AAATACAAAAATATTATCrrcOGTGTGGTGGAGCTTI^^ 

TGAACCCAGGAGGTGGAGGTTGCCATGAGCAGAGAT^^ 

TTGGGGOGGGGGGCAGGGCyySGGGAQGGGGGAAAGAAGAC^^ 

TGTCXriTTGGAAAATGGGCICTGGAATT^ 

TTATmXXrTTCrTTTAGATCnt^GAAAG^ 

GTGTGATAAAAAGGTATCAGTAGGGATCATrcATAAAACT;^^ 

TACATACATCATACTTGAGACCATGTTCCAGAGAGTAI^^ 

ATGGAAAGAGCATGCXXriTTTGACTGAGGAGGTCT^ 

GGTCAACATCCTGAAGTCTCAGGTCCTTGl^^ 

GGACTAAAATCAGGAAGGCCTGGAAAGACTTGGAAGCATTTC^^ 

TAGATTATO?IX3ATGATTTTACTACTCCrATTATT^ 

ATTACTTATGATATXmTTTCTTACTATTTGGTGAAC^ 

TTTITTTGAGGTCGAATCTCCTGTCTC^ 

AGTTCAAGCCATTCTCCAGTCTCAGCCTCCT^ 

ATTTTTAGTAGAGACAGAGTTTTGCCATGTTCAC^ 

CCrCCCAAAGTGCTGGGATGACAGGTGTGAGCCAC^^ 

CTTITCTTTCTTATACTGGAAGGTCATGATGG^ 

TTCAGAAGTAAGGTrTTTAGAAAGGTTTGAGTTATri^^ 

AATCTGGCTCAGTTCAAGTTAGCTGTGTAGCTTT^ 

TAGATCTCn7U«n>CTGTACATTTACTCCTGTGC^ 

CATTTQCTATTW^TTAGTAAAaCTGTGTAAAGCTGCTTAT^^ 

AGAAAATTGTCTTCCTTTTACAGACAAGGACACI^^ 

AAGrcACAGGGATGGAATTGAATCTGTATTTATCAGAGGT^^ 

ATCCGCATACATGCATGCCnCCCCrTAGACTC^ 

AAATGCAAAAGTGGACATTTACTCGAGAAAGAATAAAGAAACCATGCAGGAT^^ 

CTGGGGAGTCAGaXSGAGGGGTACTGGAGGAAGGCAGACTT^^ 

TATrATTATTATTTATTATTATrATCn^GTTGGAGlXirr^^ 

ctttctacaacgtcxaccicctgggttt;^ 

cactacatccagctgacttttgtattt^ 

cnx::AAGrrcATCTGOCTGaricAGCc^^ 

ATTAATAGAAGTTCTGGGACAACGGCXrTGCTCTCA^^ 
TGTGGTCXXTTCAairiXIATTTTCT^ 

GAGCAAAAAAGGCAGAAATTAATAAAACTCTATGTATTAGTTATCTAT^ 

GGTTAATACAATGCACATTCATTATTTTATAGCCI^^ 

AGAGlXnXIACAGGCTCXIATTCAAGGTCr^^ 

GGGGTCTTATXXTGaTGGGriXTICATGAAGCXiT^^ 

GCCATGGAATTGAGAAGCAGGAGCATGGCTCACAAATTCAC^^ 

ATATAGAClTTCTlXnTCGTOCTCm^^ 

TGAAATGGGCAGTGAACITCCOGGAACCAGA^ 

GGTCATTGTCAACTCTCTTGGCGCTC^^ 

TC^GTCCTTQGGTCGGCTATCTTXXrrr^ 

CCTGTTTCTTAAAGATAAGCAGTGrrATTACAGGGTAGAAAT^ 

TCAAGACTCAGCAGGGAAGGATOCACTTGC^^ 

GAAGGCTTCCTTT(XriXXXrrcGl<^^ 

GTAAGCCAAGAAGGAAATAGAATCCATTCGTAAAATGAAAC?^^ 

ACTGClTTTCCXyVTATTCTTTTTGTTAG^^ 

TATGAATCCTGAGAGGCAGGGGTGATCAGGGCCAGCTAC^^ 

TTTGGGGACCCTXOTACATIXnXSGTAGATlTAC^^ 

TATTAaSGATAAGTTTGTC n -l U ' iX/ l U CGTTCT^^ 

TCAGAGAGGCACCAAAT<XACCCA1TTAAAACATATAT^^ 

TAAGAAAGCTTATGAAAAGAATAAACAAACTGGTAGATAATGTATCCTAAAGT^^ 

TAGAAGGGGCCCTTGTTAATCGATGGCCCAAATTGl^^ 

AAAACTAGCCAGAAAGCGTCAACTTTTTTTT^^ 



TGTAGAATATCTAGA CTAT ITCT AATCC TAGGATTTC^^ 

AATAACCACXXXOTACriTCCKXriT^ 

TAAAGCXZAACCATGTTAAAAATGTATAATAAAGTATCTAATT^ 

l-rriuATGCTTXnTCACA TGQCAC CTTCn^ 

ATCTCTTGAATCAlTTACA riTITS AAAAAA^ 

GTGTTTGTTCAGCTrcATATCTTTATTTAT^^ 

ataaacagtcagagtgtcx:aagtaatctgagctagaaagtic^^ 

GATCCAAAAAAAATACIX?ICTATCAACTCCT^^ 

GAAAACXrreTAAAATTTQGGAACTTAAAClUUUUUUXSATATT^ 

TTTGATCTTAGAACATAACATTICTTGTT^^ 

AATATAAAAGAGCAAGTAATGAAATTACXATGGCTGMTTXr^^ 

ATA^AAGATOT^^ 

TTTICTlxriXXnTTTTGT^^ 

TATATTGGATTCXrTITATCCXnAAaxrC^^ 

TATnXATCTTCCGATGTCACATTATCTCATT^^ 

CCGATTTCATCCAAAGCCATTaTITTTAAAATTAGAAA^ 

CTCATGTITAGlXnCTAATGTXWVGATATGAAATGTAT^^ 

TACATlXntXIAGAACTAGAAGATAAAGATAATAAAACAAriCACAT^^ 

CCAACATGCrrcCGATCSGGGAGGAGCTGAGAAGAGC^ 

AGAAGGGGCTGTAAGAACACAGGATITTGAAGTO^ 

TGCTGGCCAGCCACATTACCTOCTTGG 

C5GGCTGTTCGTCGGAGGTGTGCAGGACACAGC^^ 

GTGTGCTACX:AGTTAGAATGAATTTTCXnX3ACATOT 

ATTGATTATCTGATCCCXAAACATATGGAGTTACATC^ 

ATCCTTCGGG^ 

ACTCTCTTTTriCAGGTCATATTAA^^ 
TTCACc-l'i\JiuCTCTGCATCITC'riUl^^^^^ 

AATGATTTATGTTGGAGTACAGGCATATCCCACAGAATCr^^ 

TTCGGACATATTTCTTQGATTGTCCGCAC^ 

CATCCKHXTITCAGATTTAGATTT^^ 

GCAGTCTGCCAGCAGACCAGATCGCAGATCmT 

ACAGTTCATCAAGGACCrAATAAATACTTTTTTCT^^ 

tgtixxxx:aggctggagtgcagtggcggatctt^^ 

CAGCCTCCCGAGTAGCTGGGACTGTAGGCATGCACTACCACAC^^ 

TACCATGTTGGCa\GGCTGGTATCGAATlX^^ 

CAGGCATGAACCACTGTGCerAGCCTTAATAAATACGTATTTAT^^ 

TCCCCTATGGCTGCTATTGTriTCCAGTGAGCT 

GCOC^^OTAAAGTACXATGTCAAGATTATGGCTC^ 

TCCTTTTTCAGCCTCXrCGAGTAGTTTGGA 

CGGGTTICATCATGTTGGCCAGGCTGCAGCTAm 

GTGGTAGCTCACTTCTGTAATCCTAGCACTT^^ 

CCTGGCCAATATGGTGAGACCCTGTCTGTACTAAAA<^^ 

CTOCTTGGGAGGCTGAGGCAaSAAAATCXnTTGAAC^ 

TTCAGCCTGGGCXXIATGAGCGAGACTATGTCTCA^^ 

AAAGACAACATCAGCTAGCGGTAGAAATGCGAATCGACGC^^ 

tatattctgcagacaggcatatgcagttagaa;^ 

ATmTATCXriGAAGArt^^ 
TTTITCATGGAAGTO^ 

TAGTTAAGCTCTTCnTTTATTGAAAGCCT^^ 

GTlTCTGTACATQ(XAAGaTTmCAGCTTT^^ 

GTA(^TTGGGCAGATCnXK^TTTCACCACA^ 

AAlATATXXXIATTAATTTGATGAAAAATCAGTATTTAaTO 

AGTATTCTAGAGTTTCICGTTTCTTGAGGAT^^ 

'I™^'I^CTTTGGAAGCCTTGA1T^^ 

GCICTAAAACAGCATCTCnCAAACITTTT^^ 

GTATA TTCGTGTGTCTGTQTGTCnxnxnATrAT^ 

CTIT TATCaxS AGCAGTGTCC^^ 

TTTCmTTTGAGACAGAATTTCACTTTT^^ 

TTOCCAGGAT CAAGCG ATTCTCXTO^^ 

TTITTTTTGTATTTTTAGCAAGGTTO^^ 

CCCAAAGTGCTGGGATTGCAGGCTTGAGC^ 

GACCCAGTTTCAATACCAACTGTTGCTTT^ 

GCAGTCACTGCCCCAATACT<nTTGATGT^^ 

TAAGGCATAGCAAATGCAAAAATGAGCAAGGTGTTCTCT^^ 

GAATGTGAAAGGGAGGTGAGATGAGCAGCACTGGGATGTGTGGC^^ 

TA AAGT TGCAGAACAGAGATGACGTCTTTAC^ 

AATTTTAGCmxnCTTTGTAGCTGTGGCTGGGOT^ 

TA^GAAGATAAATCATATTClAATCrrAATATT^ 

ATTTTGGGAGGCCAAGGTGGACGGATCACCAGAGTTC^^ 



CTACTAAAAAGACAAAAATTAGCrQGGTGTGTGGTA^^ 
AATOXrrTGAACCTGGGAGTAGGAGGCTGCAGTAAGCl^^ 
ACTCTGTIXrrcAAAAAAAAAAAAAAAAAAAAAAAAAATTC^^ 
TCAAGACAAATTCAATATTATAGCAAAGGTTCAAGTAATG 
AAAAAAAAAAAAAQGCCACCAAGGACCTCATXnTAAAGTrcATGA
ACTAACTTTTATrTTTACATTATTTTAACTAGTATT^ 

CATAGGCCGGCAGTGGCTATAGGCCAGGACCIXnXXrrATGTAGC^ . 

ACCGAAQGAGCTTAGAGGGAATCAGTGACGTGGACAAGGCTAT^ 

CAACGGACTGAAGTGCCCAAGCTCTAAATCCCTGC^ 
GTTGTTITrATAGCAGCAACACAGTAAGTTGGGAAGAGGGA^

GTTGACCACTTAGAAATGTCAOSAGTGTCTGTGAATAGT^^ 

TCH'ACAAGTAGCAATGGAAACAAACTTGTTAAAAAAAATTTAATT^ 

AAATATTTGATATITAAGATTGAGCACnTAACAAAATACTT^^ 

ATICTTCmXX:AGGGATACCTGATAGAGCTAATATAGA^ 
GGTATGTAGTTCATTTTTATAAAAAAATTTAAAAATrcT^

TGATGACTCITAACTW^GGATGTAAGACAGATTOT 

TCTGTAAAAATTATCATGCCCATTCATTATTTAAAGTQCT^^ 

ACATGATTTAAAATATGTAATGAAATTTGCAATGCATCAGAT^ 

TCACCCTTATGAAGAGrrACTTGCATTGTTTG^ 
CACTGCGTTAAATGAlKXrrACTGGCAGGATCG^

GAATCTTCTGTCXXXnTAAGGGAATTGGCAGTGTT^ 

CCTTTTTTGTTTTGriTTCrrGT^ 

GAACAGGATTATCCAGAGCnAGCAAAAGCTCTCCT^^ 

ITTCATTCy^GaTCAGATTTGCTACTTGT^ 
TGCACCAACCAGGTATTGCITCGTGTC^^

AATAGAAAAACTGTTGATTCTITTTTTCATTCAC^^ 

AACAGCCGTTTTTATAAGACTCTGCATTrACAACATT^^ 

TCGTGTGOXnXSGCTCATGCCTGTAATCCC^ 

GACCAGCTTGGCCAACATGGTGAAACCICATCTCTAC^ 
TAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCCnX^T^^

CTACIX3CACTCCAGCCTGAGCGTGAGACTOT 

TGTAATCACAGCTATCCAGAGGCIX^GGCAGCAGAACTGC^^ 

GCCACTGCACTCCAGCCTGGGTGACAAAAAAAAAACTTTAa 

CXSGATCCCCITGTTTCTAGAAGAGAAATCC^^ 
TGTTTTATTATATATTCTGCCCACTTCTATTCC^

TTTItSATACAGGGTCTTACTGTCTTATTCAGGCTAGAOT 

GGCTCAAGCAATCCTCTGGCCTCAGCCTCC^ 

ATTTTTTATAGAGACAAGGTTICCCTATGTrGCCCAGGCr^^ 

TCTXIIACAAAGTGCTGGGATTACAGGTACAAGCCACCATGT^^ 
TATTTTCTTATATTTACAGCATATGTCX?rcAT^

TTTTAGCATTAATATTAGGGAAGATAATCTGACATTTAAAAATACAT^^ 

TATAAATACCAGTACnAGTGTCTGCTAATAGGTGGCGCCACTT^^ 

AGGTTCGCAAAGGGGACAGGAGACTGTGGGATGATTAGTATATTCT^^ 

TCXTU^CAATATTCATTCTCTGTGTCyVCCTC^ 
GTTGGTCCTATTTVATTACAACCTAGAGGGATTC^^

GACTTTGGAAGGACTGATCCXXAAGGGCCTTC^ 

GTTIXXCATAACCTTAGAGATTATCTAGGCXAAGGC^^ 

ACKTTCCTAAGTTCACACTGAAAGAGAGTGGTCT^ 

TAACGGACATGGTATCCCCAGCATTTTTAATAAATGTO 
AGTQGAGCGTGATTTCATCTTTCATCAAGAGAAGCA^

AAGGTCATGCGGTATCTGOrATTITXXrATATTCT^^ 

TGGTATATGGCTTAATTGTAAGCGATCAGGCATCAATQGCT^^ 

CATTATTATAGATTGCTGATGTTTOCCAATGGATAT^^ 

AATAATACCAGTGACATCTOTGCATTTACATTCAGCC^^ 
ATTITATTTTATTTTAATTCTTTTTCAGACA

GCICACTGCAGTCTTGACCACCCrrcGC^^ 

(XAGCJiCACCCAGCTAATTTTTTGTATTAT^ . 
GGGCTCAAGCAATCCGGCCACCTTGGCCTCCCAAA 

CTQGGCTCACTGCAACCTffrcCXrr^^

TGCACCAOIACACCCAGCTAGTTTTTCTATTnTAGT 
CTGACCIXriCCTCX3GCCTTGCAAAGTG^ 
TAATAACCTGTCTTCAGTCrrACTTTTTGTC^^ 
CTCCAACCXTCAATGTGACnX3ATACX:AACAC^ 


GAATGACTCAGACTreGAarOTGAAAGAGGGAGAACTGATGTGGCC^ 

TGCirrTCAAGAACTAAAGAGTAGACTTGACCAC^^ 

CCACAAGATGGAGTGATCTTTACGCCATXXr^^ 

TTGCTCTGTTACGTCAGGACACATCACTTQC^ 

AATAGGCCCTCCAGATATCAGTAGTGACAGGGGTATCAGTA^ 



CTACTGCACAAGTTTCATACTOIAAGA 

GCTTTGGAAGCITGATGCCATAACAAAAGTGCAGAAT^^ 

ATACAGATTTOVCTATCXATGAATTGTCXXXTTAA^^ 

TCTAGCTTCTCCTGGGCATCXriCTATTTGA^ 

CGTTTTCXrAAATGAGGAACTCATCGGTAAGGCCAG^ 

GAAAAAAAGAAAGAGCGTATTTATAGATGCTAGAGAATTXTTOGGCATX^^ 

TAGGTGGGACTGCTGATCTOnTTTATGAAAGAAATAAA^^ 

TATGTCTAACCCAGTCACAGGTAAGGCAGCCAGGAGC^^ 

GCACT^GCTCAAAAACCCTCACT^GGTCACGCC^^ 

ACATGCCCTATGAATTATACCTTATTTATGATCGTCGAGC^ . 

cagtttitaatactctarrrattttaggggaggaaaatgttit^ 
cagtttccxx:agcatttaaggaaatattgttgtaagac^^ 

CAATGGAAGGACXCTCAACAGACCATTT^ 

GTACCTTTITACATATGCACATrATGAACrrTT^ 

GCTTAACAAGTTCCTCnXSAGTATTACAGTTGGAa 

AGAATCTCA GCGA TGATCCrAAATATCTITCT^^ 

TAOIlAGGAGCTTTCAAAGTCTGTGATTGAra^^ 

GCAAGAGAACAGTGATTrAATGAAGTGTCTAGTGAGCCGGCXAT^ 

GCCAGCATAAAGGAAACTGCACTCACACTGCTATTAAGCAGOT^ 

ATCTGAGCAAAAGCAATGahTACCTt^^ 

<3GGAGATCAGAGGTCXOTCrrcCAAGTC^^ 

TAATTGGTGATrAGAACTOGAAAGGrnx:AATAACATCTT^^ 

AAACnCGTCCTQCCGCXnX3GACrAGTTACXX^ 

CATAGACAACATOTAAATTITIXTrTGCTIT^^ 

CTTAGGACATACTACAGTAACCTTTTCCCACCA^^ 

TGCAGGCGCAGT^CATTTATGCAC^ 

CAGGTCTTTCnTTCATGTCAGATCCX^TATGCC^ 

TGAAAACGCGAAAGAAAATAAACAGGCAGCACTGAGGAATTATACAC^^ 

TTTAGGTGCACTGCGCATGTTGGAGGTGCCCTT^ 

AAATTCTTACCCGATGCTTAATGAATTGCCAGGCT^ 

TTTATATAGACTC^GATATTTCCATXX:CTCGCAGCAAAATT^ 

GTCXIAGTTCnXSCTCAGTAGCTGTCATTAT^^ 

TCAGGATGCATAGTAGAGAAAAATGAGATGAGGTAATCAGATCACATACA 

AGGTTCTCAATAAATCriTTriTGATK^ 

TCCATAAAACTTGQGTAATTCTAATGCTTACTTCAT^^ 

GAGTTTAAAGAAGTGTTTTATGAQ 

CCTTITCTAGTATGCACCTTTGCXX^CT^ 

GAGTCTTTCTCOVAGGGAACAGAAGTCriTAGGTT^^ 

TGTAGAGGAGATGAAGCTGATCAGAGCACACCCACACCAG 

ATCTGAGAGATGTGGCATGAGGAACACrcAGACCTAATGC^ 

CTAGTACTAAGCAGAATCGAAGGGAAGGATGGAAGGAAGAGAGGGAGAGACXSGA^ 

GGAGTGAGGCAAAAAGAAATAGGGAGAAGTATAGAGAGGAAAGAGGGGAAGQGAGGAAGGA^ 

AAGAGAGGGAAGTAGGCACAGAGGTAGAGAGGCAGGGAGATAATGGAfiATCC^^ 

TAGTCCACAGAAAGCACCTTGACCAAGTGACTGAATACAAATC^ 

TGTGCCATATITCXnTCAATCrrT^ 

CTATGTCAGAAGCAAGTCAGGATTTGAAGGCAGATCT^^ 

TCTT^GTTOCCCCCACC€XnXX:AGCAATTC^ 

CCTTTTTGGAGAAATGGCXACTGCAACTCT^^ 

AAAGAAGAAATAAAACATATTACIXXATTOCCATACTTC^^ 

GAGATTTACTGAACICAGTCGTTTCACCTGG 

CATCTTACATQGCAGCAGACAAGAAAGCGTGTGCAGGGGAAC^^ 

CACTATCAAGGGAACAGCACAGGCAGATCCCATCGCTT^^ 

GAGTTCTGGGAACTACAGTTCAAGATGAGATTTCGGT^ 

TTTCTAAGGCAGATGAATAATCCTGGTKXSAAAT^ 

GAATGTGGCCCTGAGGGACSGATTTXnCT^^ 

CTCTCCAACTCCTAAGCTX^ 

CAGCCTGGGGGAGGGTTTGTTAAAGAACTGri^^ 

TGATACCTTTTCATTAGTTACTCACATGGACCX:^ 

CACCTATAAAACCIOTTITAGTAATTTACTCXXr^^ 

GC XXJlGGTA AATAATACCCACCATCAGAGGGCACXaGTAg 

GATITTTTCTGAATAACTGATCTTTA^ 

ACACATTAAACACTAAGAGGTCACATAGGATATTCCATT^ 

TACGCATCGGTGGTTTCTGGGGCGGCGGGGAGTATGGAAAGC^^ 

TTTTGTTTlTTrrTTGAGAra 

TCOGCCTCCC^GTTAATGCCATK^ 

CTAATTTTTTGTGTTTTTAGTAGAGACGGGGTTT^^ 

CCTGCCTTCGCXnCCCAAAGTGCTGGGATT^^ 

TGGTCAAACTGTlTrcGAACTAGAGGTGATAGCT^ 

AAAOT^TTTATCITATCnXSAATTTCATCTCAGCA^ 



ATTAAGCTGCTGTTCTKnTITTTAAATATTCCAGCT^ 

TGAGGTGGGTGGATCACrrcAGGTCAGGAGTIX^ 

CCGTCTCTACTATAAATACAAAAATTAGCCAGGCTTGC^^ 

GGAGAATCACTTGAACCXXyVGAGGCAGAGGTTGCAG^ 

GAGACrCCGTCTCAAAAAAAAAAGCACAAAAAATTTT^^ 

CAGGAAATAATTGGGTTGTATGTAATGCCCAGTAAAGTAAACAGAT^^ 

GGTTATICAGAGAGCCATGGGGACACTAAACGTACCTTAI^^ 

TTATTAATAATGTATACATTGAAAAATCTATGTGCAT^^ 

ATTITXXriTCTITTTAAAAAAAATTITTCAAAG 

GC in ' iU ' n CTGATCACCACCCATGTTGATTAGATATGCATCT 

CTGTATATACTTGTATGCGCACATACACATATTT^^ 

GAGGTGAGGAATATTTGAATACAAGTCTTATCTAGTAGATTAAAAC^^ 

AGGACXTCriTITCATTGTTCTlTTATTCOT 

GAATGGTCITTTCTAAAATCACTTTGAAAACTC^ 

ACKSTAACTTGGCTTTCATTAGAGTAGAAAGAGATAAAGC^^ 

TAAGGAAAAAGGGGACTGAAATTAGTGCTGATAAAAGCTrcAA^ 

ACTTAGAATTTGGAAGAACATACTTCATACAAAGCT^ 

TAAAATGACTCCMACTCAGTGATACTGATCACXrC^^ 

TCTGACATCTGTTCACAAATGGGCAATCnTGC^^ 

AAAATTCTCTCTGCTCTITrCCAAC^^ 

AlXX^CCTCTCGGAGTCCAAGAACTAC^ 

TGCTCAGCTTCIxrTTTTCTGGAGAC^^ 

TTTTTTTTAAACTTCATGGTGAGCAGAACAC^ 

ACCAGCTTCCTTCTCACCTrAACCCCACG^ 

GAGTTCACACTGAGTTOSCAGATTCATCCTT^^ 

GGTCCATTGTCGGCCCTCnTAGAAGAGATITCAf^^ 

GAAAAAGCACATGACnTCGAATAAAGACTrAGTACAC^ 

ACCnTACTGCACCTGTTTGAGCCICAGTrTTC 

GGATCAAATGAGTTGGAGTGGGATACAGCT^CAATCTATATKr^ 

TATTACACCACATCTTCTCnxSGGTCTrT^ 

CTCTACCCCAAAAGTATATTCAAGAGCGAAACTTTTGC?^^ 

AGCTAT L ^ rriV i U XSAGTClCTCAAAAGACAmTGGATQ^ 

CTTACTOnTTTCAAATACTCAAATGCCATTGAC^^ 

GAAGTCACCACAaXXriTGTTATGGAAATCGTCAGG^ 

ACCCATCAAGATATCCACCTGTAAAATATTTAAC^^ 

TAAGAGCAGAACAAGGAAAATATAGCTGACTATAAGTTACAAAGGCAGACT^^ 

GAGCACITGGAGAAGGGCrxyWXX:AAACCAAAA(^^ 

CGCAGrrcAGGGTGATGATGGTCn*rACGAATTACCC^ 

GCAGAAGGGCTITTQGATAAAAGGQGTCTAGTTAAGAGAT(OT 

ACATGTCTGAGTACCGCGCAGATAGTGGGGAATAGTXrr 

GTTGTGTTTTTTTrcCTCrcATT^ 

GTGCTAGCCAGCCACAGGGCCAAGTGACAarrAGAGGTGAT^ 

ATCTGTAACCTGGGATTAGATAGAGCCAATAAGATC?^^ 

GGTACGTTATTCAAGGTGATCCTAGCTGCTATAACAAATAAGC^ 

ATTTCTTGTITATOTAAGATCCATAATGACTCT^^ 

A(XrTCriTTTCATCrrTGTGGCTCAt^ 

ggaagaagtaggaatttgcaggaccatgtctggagcatttt 
aggaggctaggaaacagagcataaccttc?kxxx:aggaag^ 

TTTGTlTAAAATCAeiTTQGAlTC^^ 

TAACTAAGCATGCITOriTCCCTCT^ 

TATATGTATGTATGCATGTATGTATC?TATGITTGA»^ 

TCTTGACrrACTCAAACXriCCTCXr^^ 

ACGCCACXACACCTGGCrAATITTTG^ 

CTGACCTCAGGTGATTGACCTACCTCGGCC^^ 

ACOCCACTGCTATTGTCCAGTC?!^^ 

ACTTCACAGCCITCCTAGCAACTGAGGATO 

TrrCCTTGAATACCnTTTGCTTTCn^VTAAAAT^ 

GATTTCATGCTICCT^GCCTTAGCGGTTTTC^^ 

AGGTCCTCACGTTGCAGAACACTGACCm 

TTCCCTCTCCTGCCAATTCTTGAGCTGCT^ 

TAQCXXnCGGAGATGAGGATGCTTIOrGAATGrE^^ 

TAATTAAGOTITGCTTCAAATGGGGAACAGTCCT^^ 

CTCCATCGTITGGATCriTTTCACCAG^ 

ccatgaaatc-ictgcacx:gatagttcaat^^ 

CACATTTATArrAGGGTAATGGCCTCATCSCTATTT^ 

AAGCAATAGTCXXnXSAACATCATTTGGCCnGAACT-^^ 

TGCCTAGGAGTGGTGTCATATCATGAAGTTWSAT^ 

AATTTTTITTTCTATTCACCAAAATATIXX^^ 

TGCAAAATCCTTCTTTAGAAATGTIACXnT^^ 



ttaggttot'atgatgcxaatttgcccacaca™^ 

TCAAGGTTAAAACAGGAAAAACTAACACAGTAAATACCT 
AGCACCTATACAAGGAACTGGCTGGCATATOSCATTC^^ 
GTOGGAATGTGGTCAGCXriCTTAT^^ 
TTTTTAAATAATGAAAAACTriXSGGGCCTTGT^
TGTTAGTAGAAACTAATITCTCrrGCrTTT^^ 
GAAAAAAATGATGAGCCGGGCACAGTGGCTCACGCCTO^^ 
AGCnCAGGAGTTI^GACCAGCCTGACCAACATGTT^^ 
TGOTACACAGCTOTAGTCCCAGCTACTTQGGAC5GCT^^ 

AGCCGAGATOTXXrATTGTITCT^

GAATTATGTCATTTCGrixnCACAQC^^ 
CCAGTCTC?IX3CnTTftGCACGGGAAAAAa:^ 
TGCAGGAGAACATOTTTAAATTGCrcAAAAA^ 
AACTTGAGTCnCCITAAGTAAAGTTTTAT^ 

15 <^<nCTCA(nCTGTCACCXAGGCTA£^^ 
ATTGTCCTQCCTCyySCCTC^ 

GAGACAGGATTTCACCATXnTCGCCAGGCTAGTC^^ 

TGCTAGGATTACAGACATGAGCCGACCITITrAT^ 

AGGTTApAGCACAGCATGAAGAGCTGACAATATTGA^^ 
AGATACACTTATCTCnCAGCAGACGTGATTCAGGGGAG^

ATGTTCCC^CAGTGCIGATATGGA^ 

CCTACICTTTACTCa:^^ 

<^'rG*^CAACAAGAGTCACCTCT 

<^TTCnTXnTC(XXXmX3TTACC^ 
GGAACTTCriCTTCGAAATGAATAGTCT^^

TGATAATAAGAATAAAAGAAAATGAAGTGACATTX3AGGACGTCT^^ 

CGCTGTAAAATGACTGTTACACGTTTATGGATTGAAATQ^^ 

ATTACTAACACCTCTCACATTGTTAGAAAACAGTT^ 

AAAGTTCTCATITCATGCXriTAGGAATCCCC^^ 
TTGCACATCTTACCCCGAlXXrACTCr^

II^T TTCT CAGATTGGTTTAGGTAGAAAATAA^ 

TTTICTTTAAAGCTGCTGTTACriX^^ 

'I^ACATGCTATAC^ 

AGTCAGTTTTCTTACCIGAAGGAGAAAGCCTGCA 
CTC TGCGA^XIXCT AGTCTGCTC^^

•i-i-i-riv-lu-ACAAATACAi-iUU-iCAAAATGAGTAAACATATCT 
GAGATTCATCTTTGTAGCCACCAAGAGCCACTAACAG^^ 
AAATCIxnXSACnXiATAAAACCTUnXXrr^ 
AAAGGCTGGCTCATTAATAGGATATGTCTAGAGTTCTGCTT^
TG<^GCTGCTGCTGCTTCT^ 
CTCTCXTTGTTCACCCAGATTGGCTG^ 
QGATAAACIXX3GATTTX3IJ^TATTTTGC^^ 
TAAGGTGGTCCTTTATAAAGTTTAGACCTATTTTAAT^ 
CrTCATGGACAATGAGTACTCACTCTACTTTT^
TACCCAGGCTGGAGTGCAGTGACATGATCTCGO 
GCCTCCCCAGTAGOT3GGATTACAGGCATGTG^ 
CATGTTAGCCAG GCK3GT CTTGAACIC^^ 
TTAAAAClOT^CTTTTAAAAATTAGAACCn^ 
TGAGAAGAGTTTCGATTTGGA OCTA CCTAGG n ^^
CTGTAAAATGATCACTCCCXy^GTT^ 
CAAATCTTGCC^^ 
TGGTITTC^TTTGCATTTCTCr^ 
TTTGAGAAGTGriraSATCATATCXriTC^ 
GGflCAAAAAACCAAACACCXXATCmCT^^

CATCACACACCXXXXXXnxnTOroGGGTGC^^ 
GTTAATGGGTGCAGCACACCAACACAGCACATG^ 

TGAAAATATAATAAAAAATACATATTAAAAAAAAAAGAACTGGAGCATCG^ 
CATCATCACAGAGGATGAAAAGTCTCCTT^ 
AAJfiCTGAAATG^^ 
TGTITTGTTCGTTTTTC^^ 
TTCICCTTTCX?OTCCT^^ 

TTGTGTCTQCGei-i-i-lvriGGTAATTAAATAAATTAT^ 
^ GGCACTCAAAAGCATAAGAGCCCT^^ 
CrriTAGTGTAATCGAAGCAGCAAGAGTAGTCACAa

AGQGGATGATCrrTCTAGTCTAGATTACCTATTGAT^^ 

A^^CT^CAAACCTT^^ 

TTXriTTTTTrAAGTXnTAClCTOT 

AACXTCACCTCTGGGCTTGCTTAAAACTAAC^^ 
AGGATTATGGAAAAAATAATAAAACTAATTTCCAGGGGAGAAT^

ttaatagtgaaatcatatcatatatataaatcatattttagcctat;^ 

CITCATGTAAAACCATGriTACGTGCX3GATAAra^ 

GAAGTCAAACACATTAAATGGTGGGTTTCATCCAAAAAAT^^ 

ATTAGTGGGAGGAAGTATAAAAGATATGGAAAAAGATATTCTGGTTAT^^ 

AGGGGAAAG?^AAAATATAAAAGCIXXAATAGGTTTTTCTATT^^ 

ATCCCCTAAAGAAATATIX^AATCAATAAGAACACAAAATATGGAATA 

TAGCACAGAATGAAAAACACAGCAAfOTATGTACGGTTTAAAGTG^^ 

GATAAATACAAGTAGTAGCATTTTTAGTAAAGGCTGriTACAAAATAC^ 

TCATTAATGTAGAAAACAAACACAGTGTATCCATCTTC^^ 

AATAACTCAGCTCATTTGATCTXSGTTCT^^ 

TTTCAGACTCOGGAAAGAAAAAGCGATGATAATGACGTAT^^ 

AAGAACAAATCGCTTGGCAGCCTACTGTCCTC^^ 

AGCCTTO:CACXriTrGGAAGCTGCATTAATT^^ 

ggcaggaaagagttttaatgcaggctcxaaaggtgci^ 

agcaatctagcagttcaaactagacagctggttct^ 

ctctcccctitccttggacaagccct^ 

attixsgttttgatattoxaicatqgcc^^ 

atcaacaaggcatattgcx:atgactgtaggaatcaccaati^ 

tagcattactgttttcctgaatcn*aggagtatit^^ 

gtgttttgttttctttctcnttgga 

ATGTCnCTGGCACCAem:AGACAGTAAGGATAT^ 

GAGAGAAAGATTAAAACATATCCAAATAATTTTGATAGAAGCAAA^ 

GAAlXXnTTCATTGTAATCTTTTAAAGCTT^^ 

TAGATGCAAATATQGTAGTACTGTGAAGATrATGGTTTTO 

AOTTTGATAGTGAAAATGGAGATAAGACATCTCCCAr^ 

AAACreTTATAGGAATGATTGACATGGAGGTTCATC^^ 

GACAOaAAATATAGTGACAGCOTCTGGCXrrGGG^ 

GATAAGATTCAAGACCCAAAACTATGAATTOGAAAGCCAGGCTGAGGC^^ 

ACCCAGCGGGGGTGCATIXSTTCACTGCCCCXXriCC^^ 

AGAGAAAGGGAGTGTAAAGGmGAGGATTCCATGGCTl^ 

TCATACCTTCTCGAAGGCAACTAGGTGGAITACAGAATTG^^ 

GAGCriCACTGTCXAATATGGTAGCCACGTGTXSACTATTAAA^ 

AATAGCAGCCACATITCAAATCriTCrGTAGC^ 

ttctaccatagtagaatgcictittggatagatct^^ 

gttccxsagagaaaacattataaacatclxnttatc^ 

gggtcgcaggtactcntttaagtgtwrgcatatatttacat^ 

gttgtcx:atatttcacagatgaggaaagagatgtgct 

tttcagggtcccatacttttcaatgtaagcatattqg^^ 

gcactgacacaggtaccotcixxagtcxtc^^ 

gccatttcoccagggagagggtccatgtixxsaaccti^^ 



TTACNAACATTOnGCAAAAAGATCAGTXnTTG^ 

TACTTCACCTACTCCTAATAAAAGCCCITCATCn^^ 

GAAAAACAAAAATCTAAATTAGCAAAGAGAAATTTGAAATia:AT>^ 

TACCATTCCIX3GAGGTAGTAGCTAATAG?^TGATCC^^ 

CCCTTAATTTTTCTAAGTGATGAAAGCGT^^ 

AAAAAGACTCTGTGGCCTCnxnCTCl^^ 

aggatatxxtttxiaatagaggtgaaaatcctgaaataa;^ 

CATC3GCATAGAlXXnX3AAClTTTGATt^^ 

TCGCCTCCCACCACXXXriT?ATTTAAAGGGAT^ 

CCTAGlTCATQGAAlTGrrcAAAAAAAAAAAAAAGATAGAATGl^^ 

CAAGAATGCTATGACTTTTGCTCAGGGC^^^ 

TGTGCGCGCGCCTGCATGTGTACTGTGGATC^ 

GAACCTTTTCTATTAATCOTCTTGGGGOT 

TACCCAAAGAOTIOSTAACriTTCTC^^ 

TTTCATACACATTTCAAGGACTTAAGAAATCT^^ 

TGGGGAATAGGTGGATATTATftCATTTATTTAGGIXXS^ 

CACAGCCCTCATTAGTGAAGCAGAGGTTGAAGGAGGTC^^ 

TTATTATTTrriXSAGACAGAGTCnTQCrn:^^ 

AGTTAGACACCGGGTTTCACCAT^ 

GAAAATGCTGQGATGACAGGCGTGAGCKXXri^^ 

TTGAATACTTACTGGGaxrKnGCTAAGTGCT^^ 

AATTATCCTGATTTOXAGGCAGACGAAACrc^^ 

TGGCAGAtXACTTATTQGCXXrrAAGACTGTT^^ 

TTTCAlTTITCriTCGGAAGGATCTATC^^ 

ATGGTACAATC1CTTAAAA£X3CATGAAAAAGAAACAGTCAT^^ 

CACTGAGATCTATGCATGTTGTGGGCtXACTIT^ 



AGJ^TGAAGGGCCTCTTCAACCTACCTTrTGT^ 

TCTCGATAlTCATGAGTCnXOTAGACTGCTT^^ 

TTTTGATTTGATCnTGAAGGTCATTTCAAAAAT^ 

TGTCACTTTACTTOGAAGAGAOGGGTCriCAGG^ 

GGCTGACATGGAAGATT'XnUU'rXUTn"rAATACnTTTC^^ 

AGTTTCTTAATGGTCTTTAGCAAAGGAAGTAAGTGACGC^^ 

GCTAGCTGATTGCCAACAGAAaXXXXrrTT^^ 

ACCCCCTTCACATACTTATACTGTGGGCTGCATC^^ 

AATAAlXnxntXXXrAATCnCAAAAGATACTAAT^^ 

ACCTITGGAAATCAAGGCTCTTCTGGTTTAT^ 

CAGCATCATGATGCAAGCATrcACTTTGCTCACMA;^^ 

GCTtTKiACAGAACACXnGGCCCXrrAGT^^ 

GACATAATTTTTGTTTTXrrcTTAAAGGGTCAGTTATA 

CITTGAAGGAATCAAAATTAACTTGAATACATTCnCTI^^ 

TAGGAGCTQCTCTTAATCATTATTACTGCATAGGCT^^ 

CITTAGGCCTTGGCATGTATTAAAGTACTATACATTTGT^^ 

TTTGTTCACAGATTTTTITTTCCAACT 

TTTGAATTGGTCGATATACrcTCTAACTOn^ 

GATTAATCATTACAGTAAGAClTATTAGCTCXIATTrATATC^^ 

GTATACTTAAGCAGAAGTTGGGITTTACGTTACAATTTM 

GTTGAGTGGATGTGGGGCTTTAAAGGCAGGAGTCTC^ 

TCTXXCAAAATCAAATACAGGAAGTCXiAC^ 

AGTCGAGGTTCTGGAAGATGGGTCGGGTGGGGGGAAGC^ 

TTACATCAAGGCTGGGAGGAGGTCTTTCCACT^ 

CACGCTCTGCTCTCXXXrrcACTrcC^ 

TGAAACTC(XTAGGATGAATGTTGGAAGTCXri^^ 

GGTTTAGGTTTACATGTACITTAAAAAAAAAAATTAAT?^AAGTITAT^^ 

AGTGGGAAGrrcTAGAGTTTCATATACTCCCK?!^^ 

CTACATGTTCATTGAAAATGATCriTTTGGGTGGT^ 

TITTCACIXXXrrAAGGGACCCTCTTTTGOT 

GGAACCAAACTCTCTTTATTCCTCCCGTTTGC^^ 

CrCCTnx:AAACC?ITATTCX:ACAACATACTAAATT;^ 

ACAGCACGCArTTGTTCAGCTCTGTGTTGA^ 

TGCTTCTGATTTGGGGCAGCACCTTTGCTT^^ 

AGGATCCCTtrrcCXXT^GCATGGGGCrrcATACAT^^ 

TTTTGCXSAGCCTTATTCAGAATGTCAGCATGGATC^^ 

ATACGTTATTTATGGTCITTAATTATTATTGTACAGAaTm 

AAAAGAGTTTQCTAACTTTGAATCCTAGATGTTTTATC^^ 

GTGTCTATGTGAGGGGGTGTGTGGOGGTATGCaSC^^ 

AACCCTCTTTTGTAGTATTIKSCTGATTACTGT^^ 

TGGAGTOXSAAAGAGATGCGTAGTCCCATACCATTT^ 

TAACCTCATGAGCATAGATAATAGTCACATCrrACTAAGATTAT^^ 

AATATAGCTTATTGAATCATACATTTATCTACTTTAATAC^ 

TGATAGTTTAQCAGTGTGCTCTCTTAAGCCAGTCGG^ 

TTTTGTTTTGTTTTCTTTTOITra 

GTGGTGTGATCrcCGCTCAGTGCAGC^^ 

TTACAGGTGCATGCCAiXATGCCCAGCTGAT^^ 

CTTGAACTXXrrcACTTCGTGATCX;^^ 

CATTGTCTGTCAQGrrTTAAATTXX:ACCT^^ 

GGQCATCXXrKX:ACAATTTrcCGCT 

CATGTACCCTGCnTGGTACATGCCAGATGATT^^ 

GAGATGATTCnATAACAACAGCATGGTGACAlx:^ 

TTATCrmTTTCTCAGTTTri^^ 

TrrcCCTGAATGACTAAAQCAGAGATTTCGTTA 

GCATTATTTCXXrAATTCCGTATCOCATCTAC^^ 

GGAGTCTCrrTATTAATGTTCCGaCTTTGCATGGr^^ 

GGGTCXXXrnnTATOrATTTCGGACTTAC^^ 

GGAAGGTCGGTCGITTGTACTGGGCATCTXnXXr^ 

TAOCATClCTtXX:AG17^GQGATAATAAAAACAT^^ 

CCCTGGTTGGAAACTAGriXnTCCAGAGCTA^ 

CrTCCAGGAACXACCAATCCCCXIAACA^ 

AAGCTGGACTQCAGAAAGAAAAGCAGTATTTOTA(^^ 

GGAGTCCTCMAAATGGGGCTTCTAAAATAAGTGGM 

GCCTATTITCTGAGCTGTCGAGCAGGGAG^ 

CTGCIACCTTTXXSCTGGGGATGGAGGGA^ 

AAGTITOGCM'AGTCITAGCAGTTAGICC^ 

TGAAATAGGCnx::AGTCTAG(rrTATGTCTAGGl^^ 

ATTGTGTAAAACCTGCACAGTGGTACACCTGAGC^^ 

aXTCTTTAACXMAAGACTTTCTGATTrrrc^ 



rroXXXXrrACCTGGAGCCIGAGTOrrCAGG^ 

CTCTGAGTCATTAGGCGTCATCTGAGACAG^^ 

TCTTCATGGGOCATTAGGAACTGTGTGACGCTGGGGCAT^^ 

ATAGGTOTAATGATATCATTATGGCAGATCAGTGGGACGATTGAGT^^ 

GGCTTICAGTGGGCCACATCriX?ITGTCAC^^ 

AGGCTGCCTTGTTGCCTCCTXXATGTAC^^ 

CCCTTTCAGAAACAClTCAGGAGATCyATAGTTAT^ 

CAGATCAATAGTGAAAGCAGGTrcTGTTAAGTriTlTATAGGA 

GAAGTCAGTCTAACCAQGATGATAATGAAGAGTTAGGTCAAGra 

TAAGAACAGATGTTCTCrcCAGACCCCCAGGT^ 

TTTTTCCTCTCTTTACTTGGCCAAATAT^^ 

CAAAATCnCTTTTGAGAGAAAACAGCAGGGACATTT^^ 

GTTTTCATTCTCCCCTTAGCAGCTTrCAAAT^ 

ACAGGGCTTCTATTTCCAGAACCTTCTrAACTC^^ 

ATGGAAATGGCTGTC3GGGCGCCTGAGCIXK3^^ 

GGGGAGTTTCGAAACCCTOCTAACATGTCATCT^ 

CTGATAAGCAAAGGGTGTATCACTGGTATGGGTAACTGT^ 

AATtmTATATTCCACTGGGGTAAOriCTAACAAA^ 

TTGAAACTGGCAATITACATrcTTCTTAAAATACXXSAGC^ 

GAGACAATTTCTAGlTCTGTTTATTTTTi^^ 

TGTCGTGGeGTTGGCTTTTA(nTGGGGTTGAC^^ 

GGTCCXSlTGATCATCAACCAAGa^CCAAGAAGCCnT^ 

ACACACAAATAACTCAGCXnTTAGTAACITT^^ 

GTCXACATCnxnTrATGCAGTTACCTTlCT^ 

TGATCATTGTTCn'AGriTCCCGCAAATCTAT^ 

CTAGTTTCAGACTTCTGGCTTGTTXTK^^ 

AACAAAGGATCXrTCTGATTGCrCAGATGTC^^ 

GTTCTCX:xriCCIXX:AGCAAACTGCAl^^ 

ACCCAGCAAACGTGGCTTCTCTTO^ 

TCACCATTTGCTCTACTGCAAAACAC^CTC^ 

AGCATCACGGGCJ^ACTOGGCTTGGCTCACCAGCC^^ 

TCTCCCTCTKXXrCAACTTCATGC^^ 

CTCCCCGAGGCTGOnXXXATGCTCGTGCT^ 

CCTCATATClTCCACAAAGCCrTCCrAGACCTTTAC^ 

CCATGAGTGCTTTTGTATGGTGACTGTCTGGTTAC^ 

CAAGTTTTATTTCCTGTIT'AACXrrTGGCCTG^ 

TTTCCOCACACrcATGTGTATTTAAGCrrC^ 

rrGCTGTCACaiAGGCTC3GAGTGCAGTGGCAG^^ 

TACCTCAGCCTCCTGAGTAGCTGGGACTACAGTTCSTGC^ 

ATAAGAGGGTQCATTGTGTACCTGTCXX^AGTGATTACAGAATC^ 

TATTTGTAAGCTGGC^GATGGGAGCTGAATITCA^ 

TTKXIAGAGGAGGGGGGCACAGATGCCTCCAGTGGCCC^ 

CTCTTTTCATTACCCTTATCAAGATOCTG^ 

GTCCAGGTGGATTITAAGCATGGAAGGTGACACTT^^ 

TlXSGAGTTAGTCTCrcOGTlTTCGAAAe^ 

AACCTGGAAGATCAGTGGCGTTTCGACACACAG^^ 

GGATGAAGCACAAAAOCTCAGCXACAATGCAGTATAGGTTOG^ 

ACTGrrATAATTTTTTTCTTCTITTOT 

CATTTGTGACTCTAGCmTTAAGCTTCaW^ 

TCAGCATCCAATCCACATTGGACTGT^ 

GCAGGAGACAGTCCXrTCAiXCTGGGGACTAAGCC^^ 

CTTTTCAAAGAAGACTGAATTTATGATGAAACTT^ 

TCTGGATAATTCTCAATGACTTTGAAAG 

GTGTX?rTTGTTTTCXZACCATTTGTAG(^^ 

CTOGAQOCTCGAGJ^CGGCAAGTACCCAG^ 

AATCAGTCATGCTTAGCATACTACTCCAAGTAGTCAT^ 

CCGGTAAAGTGACGTTTTCATGAAAACAGC^ 

AGTTACTTAATOGAATAATACAlxmrCCT^^ 

AGACACATACAAAAATGTCCTATCTTAAGAATOT^^ 

TCGAAAGAAAGTTCACACrroCAAGATC^ 

TGGGAACATATCAGTCACATCAAGCnAGGT^^ 

CAGAAGAGCAATTGCATTTGGTCATICTAGTATCAAC^^ 

TTOTTTTTACAAGTTCrcATTAAT^ 

TATATGTATAGAAGCGTATAATCCACAGCACTCCT^^ 

CATTGTAGAGATGAGAAAACTGAGGCTITACAGAA 

ACACTTCCCTACTAGGTAGAGCAATC^ 
TAAAGACATTCCTCAGACTGGGCAATTTATAAAGAAC^^ 



TGCCTCriXXXXSAGGCCTCTWS^^ 

ACTGGAGGAAGAGAGAAAGTAGGGAGGTajTAGATGCITITA^ 
ACTAGGGGGATGGTGATTAGAAAGCACCTCATX5ATCXX:X3^^ 
ACACCTTTGTGAGGTCeXriOnXXAACAC^^ 
ACTCXSAAGGTAAAAAAGTTGGCAACCAGAGGCTrAGAGGAGGr^^
GATGCmAAOA^ 
TCTC?IXXrrTOTX3AAATTTO^^ 
GAACCCXACACAGGTAOCTIXXyAAGACTGCAT^^ 
TTTTTTXSAGGAAAATAAATTTAAGGCTTQGTTATCG^ 
CeCAATCTGTATGGTATGGCTITriXXri^
GGCTGATGTTCATCGCACAGTCACAGCXX^ 

GGTACTTTCrrXXTVACCTAGTTeACCGC^ . 
CCCTCXSGACATXSACGTACTCXrA/VAC^^ 

ATCTGTXnyxnTATAAGAAAAGATATAGAAAGATTACn'AGT^^ 
GATAGACTCTTCXXnxSTOXX^^^
CGATK?rCCTGCCTCAGCCTT^^ 
TATATACGGGGTTTCACCATG1TX3CXX:AGGCTGC^ 
AGTGCrcGGATTATAGGCGTCAGCCACTGCACC^^ 
TTTGGCTAQGAAGTCATAGAAAATGCTCUTirn^^^ 

ACTACAGGCCAACGCCACGGCACCT^^ 

TCTCXSAACTCCTGACCTCAAATGATT^^ 

G CCAGG AAGATC3GTCATG-l-i-iuACTrTGA^ 

ATTTTCAGTTACTAGTTAGGTAAAGAAICAAAATTAATAAA^^ 
TCriTrAAATGGATAACCACAGTClTCTAAACAC^^

GAATTTTTOSRTGATTAGACronViTTCTT^ 

T'TCCCCATACACCACCnTGriTTT^^ 

TGAAACACGA CTTCT GTCCTCAGAAGGCAG^ 
^ TCAAGCATATGTTTGCAGTCATCACT^ 
GTAGAAAGATCACTGCAGATXncCTTGTCGTCn^

TTAAGGGAGGAATCACACXJIXnTAGCACGGGGTOSGCA^^ 

CAAAGGTC5TGTCn'ACCATTCrrATGACCATTGAAI^^ 

TTCTTTAGTGCATAAGATCTT^ 

CCCITTAGACAACATCACTGACAACrcACTAT^ 
TGTGCCCAGGGAAATGGGAAGTGCATTGAGGAAGAGGOXrC^^

TTTGCTGTGCCATGTGCyiatSGAAGGAGAGGAATGA^ 

CCOGCCCTCTCAAGCGATCCAGAAGGAGAATGAAAOX^^ 

GAAACTITOSAGGAAGQCAAAGTCAGAAATTATTCTGCA^^ 

ATTGTATGCXnTTIXXriTTCGCATAAAACAAC^ 
TGTITCGAAAGACTCAATAGCATATOCACATGA^^

ATTCTAGCTITTCCrCCAATTTTATTCAG^ 

attsgagacttaa^ataagatctgac^ 

cacaatgagtcottttgtatatcaaccxsaa;^ 

aaactictgtacagaaaggaagcaatgaacagagtgaagcagcatc^ 

ATGTGATAAGGGGTTAATATTCAAAATGTACACACAGCGTAAAI^^
ACATCTAGCAlTATTGTCnxnTGAATAAATACT^^ 
GG?CTCTGAGCAGTACATTAC^^ 

TTTTAT CATAAATAAATCAATAAATGGCAGATTATAC^^ 

rriTTAAGGGGAATTTGGTATTTTTAAC^^ 
GGGACSGCCGAGGCGQGOGGATCACAAGATCGGGAGATC^

AAATACAAAAAGAAATTAGCTGTGCGTQGTGGCGGQCGC^^ 

CATGAACACGGGAGGGTGGAGCTTCXAGTGAGCCGAGATT^^ 

ATCip^AAAAAAAAAAAAAAAAGAAAAGAAAAAAAAGCATCAGGGT^^ 

GTTAGACTTCTTAAAGCGTATTTTGTGG^ 

2^5^^™CACAGTTGATAGAC^^ 

TTTTCTSU^ACCACACATGGCCTGGTCA^^ 

GA'IGAGGAGTTGAAGCnXSaCAGAGACTGAGAAAGGGC^^ 

TGGGAGATGTTCMX3GCAAGAAAAATTCACAAGGGA^ 

CTTTmXX:GATTCCCCATIXriC^^ 
GCIiTXihTX^OCA'TGATX:^^

<X TOGG ?ATACflGGCAGGAAG(XACTGrrAC^^ 

TCCTTTOTOGCATTGA^ 

GGC'i-iuu-lGTCTATGAAGTACACGAGGTTTCTTAATAGT^^ 
<5TTCAGTCAAT!AATTAACTGAGTTTG^ 
GTGATGCACGGTGATAGTQCGTATCrAACACTT^^

GGGGGTGAGGGGGCGGMGTCGTGGAAAGGCAGCAAG^ 
TGGCACACATGCrcTXmGT<nTT^^ 
AGTCICATAGTCTTA TOTGT CTITTGGC^^ 
GTQGGACTCATC3GAGATTTTO:GTTCTTT^^ 
AAlXXXXXSGCGGTCSGACGGGCTTCClXyX^
CATCTGGTAGTTCnTTGlXX:CTGCGACTCr^^ 
TTTTAATCriCTAGCTCXrACTATCCAG^ 
AAATCGAGGCCAAGGCACTt3GAAGTCCX:ATT^^ 
CATTCCGTCTTTAATCnX3CTATAAATAGACATl^^ 
GTTCTGAGGCCTCACTGAGACTCATT^^ 
GTTrcTCCCTGTGQGAAAATACAAAACAACACAACAGC^^ 
GTXnxaCXATTCTAATGCATGAGCTCATT^^ 
CAGTACTAATTCrKnCTC?rcAC^^ 
CATTTXXTGTOGTTCCTAAGTCTC^ 
GCACAAAATACAGCCAACATTITTGCATTICTT^^ 
CCCTTCAICTATCTTTGTTCCQC^^ 
CAmTACCATGAGAAGCCATAGCy^GATGTGCCAGGTG^ 
CCTATCACTGAGGGTAGGGTAGGTAGATCACTCATTA 
ATTCCAAGCAGTCTTCCAGCTGGACrAGC^^
ATTCGAAAGCTTTTITCTATGATCAATTATTC 
CATGT^A^OlGCGC^^ 
AGTGTCTCICTTTCTCTGGAA 
AGCCAGGGAAATATCmTCACTTTGCTOGGGl^ 
TGGGCAGGGCAGAQGGTCAGCTGAATGGCAGGGATTCCAGGT^^
AAT^CCTCACTGCAATAC^ 

AeiUU-iuut3GAGCAQC7^:CTGACATGTTGTAAAAACACA^ 

ATATClTrciTTGTGTAAATGITTTG^ 

CCGGCTGCAGCAGTGTCTAGGTATX:r^^ 
TAA^TTXX:AA CCnTTOX^

TCTTTTICTTTCTTTTa^^ 

ACTGCAACCTXnxSTCrra^^ 

CATGCTTGGGTAATTTTTGTACTTTTArrAG^ 

AAGTGATCCACCCACCTCAGCCTCCCAAAC^^ 
AATCAGGCTCXATGGTGGGGAAGGATGTTGCGGriXXSGG^

GCCCATCGTCAGATTraSGTTCCTC^^ 

GCAAAAGTAAGAATTCTATCGTAGCCTCACOCAGGOT 

ACrixnCTTCC CTCCCA GQCACACATC^ 

TTTCTTTAATAOTmATTGATATATA^ 

gtgtattcagagttox:atccaccatcaca^

CAOXXAAATACCTOTICTCnxXAGTTC^^ 
CGTACCTCATATAAATAGAGTGATATCnTAGTanx:^^ 
TATCCCTCXXSACAGCATCTACCAGITCATCATT^^ 
TGTGTATCCATTCATT^CTTCA 
TTTCAGCATGGACTTCTCnTT^
GTCTTAAAAT^^ 

TGAl-A -iu-iuuCA TQCTCTACTTACAGTAT^^ 

(nXXrCATCTTTGATTACTACACAGCTTCGG^ 

TATTTGAGOCTTCTAACAACATTGCC^^ 

GTGTX?IXnXXiATrAAGAGAACCTGAGAATC^ 

TGCATGCrtSAGAGCATGS^TGQGAAGATGGAGTCT^^ 

TGAACTCAAAATAAGGTAACAGACOCTCAGGTCTACCTAT^ 

AATAATACATATG CTCAAATTOr rATATCCCCC^ 
T AAAA A AAAAA AAAuuu-iUU-l-ltSATCAAGQTGATTTTGATAAACT

TCTITATTTOnxnCTAAAAGCATAATT^^ 

CACCATTAGAAATA^MTTCTTCTTAAA^ 

aiGGGGGCATOnX3GTTreACTGT0^ 

ATTTXnGATTCCTTCCAAGAAGGATTCAT^ 
GTTCTCAAATTTGCTAA^

AGTAAATCACATTITTATGCTTT^^ 
- i^TAACATTCAAATACATCTTAAA^ 

TGATITATTCTCXnATTAAGTT^ 
^ ACTCAAATACCAGGrraiATAGAAATOCATCTGl^^ 
TTTGGCAAGTCTAGAGGGTAAAGTATCCCTGAAa^^

AAAGAAATlXXXIAGGAGAAAGGTACrTTCTGAGATTC^ 

GTmxnrOCAGGGTTGCTATXXX:^ 
^ GftACTGQGCXnCGTGATGCCAGTGTCTTC^ 

tcggtttatcctgaacacatgtctgtgc^

tcx?icatgqggcttcgaa(x:acgccacggac^ 

GCTCAGGGCTCIxnXJIXS^^ 
AAAOCAAGCC^^ 

GTGTTTTTCTATATQCTTATGTATATGCXTI^^ 



ctctaagggatattattgttcctgaatttttctgagcc^^ 
cccactcttcagaacttcaaatctaaagatctt^ 
aattctaatcctcaagoctctagtaatgcctg^ 
tgagttaacaovctoxx:aagactccgaggcagc^^ 

AAGTTaX3Grix:x:ACCAGGCCCTCT^ 

TCCXrrGAAAATCATCTTACl"lU'iUTAACGCTTCGTACCT 

TGGATACCTTAGACTCnXXrrcGGCTCTAGAAGAAGT^^ 

cctx:accaaaaaatcaggcttctggcttgctcc^^ 
cagtcctggccatgcxrictctgctctc^ 

CXlAGGAAGAGGCAACTAGTTaiAGCTCGCC^ 

GAGTGGGTGGAGAGGACTCCAGGGOCATGGAGGAAGATrcGGCC^^ 
GAGAAGGACCCGGTTGGCICTQCAAACCGCATTT^^ 
ACCATTCCCCGTCAGCCCCTACCXXSACXX^ 
CnXXnXXXTTGGCXAGCAGGGGCnGCCreG^ 

TGAAAAGACTGCACCAGAAATCGGAGGCAGCCCTCCAAACATAAC^^ 

GTXnTCT(XTGGCCAAGA<nXXAAGCGTGTTTT^ 

GGTCAGTGACTTAGCCACTCTATAAAAATGAAGAGCAGAGCT^^ 

AAIU^AAAACGAAAGAiGAGOCTGAGCATTCCTT^^ 

TCCGCOCACTCACATGTTTATGGCCXXXnxSAGACa^ 

CAGAATCCCCTAA(?rcGGAAGAGCTTiX3G^^ 

CTCGTCATnxrrATACTTGAGTTCTr^^ 

ITTITTTTTOGTACAGAGTCriTCC^^ 

TGGCTCCTUSGTTTAAGTGATTCTCCTGC^^ 

ATTmTCTATTTTITAGTAGAGACGGGATTTCAC^ 

CXrCXXrCTCAGCiKXXXAGGGTGCTGGGAT^^ 

GTTGTCATAAACCCTCGAGGAGAAGCAAGATCTI^^ 

GTGTTCTTCTATICATTCAACCAACAGAGAACAAAra 

TCriOTGTGCTGACTATATTTAAAGACAAGATTAGCAGTAATATATAA^ 

CCATAGAITGGACTTCACATTCAAAATCTAAATTG^^^ 

GATAAAAATAAAAAGTCAAAAQGAAACAAACAATGTTRCTC^ 

TACAGCTCrrGGC7VTTTClXX3GGAACAC^^ 

tttixsaaaatacacgcagagagacccactgcatgtcytatta 

tgt^cacacatatcacccxx:atactcgacattctt^^ 

ctxntogcatgcgttcttacctgttgaggatt^ 

GTGTGCTGAGCGCAGAGCCATTCrrcAGCAT^^ 

AGACAAGCTTGTCCAAYCCATCXSCCCCCGGGTCG^ 

AACITTCTTAGAACATCAQGATTTTTATGCACGGACCTT^ 



TTAGCTCACTGCATCCTCTGCCTCCCAGGTIC^^ 

CCCACCTVCCAAGCTTTGCTAATTTTTGTATITT^ 

CTCCK^TCTCAGGTGAXXXriCCTGCXri^ 

ITTITTTTTITITTTTTTTTTT^ 

GACAATTCTTCCTCCACTGTCGCCCA^ 

CAAGACGTAGCTAATTQCTCTGCTVGAACAGGGr^^ 

AGTCCXrTATTCTTAGGTTTTATTTCTCAAA^^ 

TGCrcATTTTTCATAAGTV^GTTGAAAGC^^ 

AAAGAGTTOGGAAGCAGAGCGTCAATGCCGGGATCTG^ 

CYKATTCTCSGTCGCTCGGGGGTGCCCXOTXS^^ 

ACCrAfXM.CTCTTTTCAGGAATGCAGATAT^^ 

AAAAGTAlXXriCTCTAGGCAAATAACGAGAaTlXXX^ 

CACXH'ATGTCTCrCTCXXmSCACACC^^ 

CXnCTAGAACAGGACAGGACAATCACAAAGTTATTC^ 

ATCCCAGaXrrTTGGGAGGCCAAGGCAGGC^ 

CCXHOTiriCTACTAAAAAATACARAAGTTAGCC^ 

gc^^sgaactoxritcaacccgggaggc^ 

gcxsagactcagtcicaaaaaaaaaaaaaataatataataataataat^^ 
cgo^tgtgactattagqixnxxti^^ 

accacgggaaacaqgcaaaaactactgatcaagcctt^ 
icttaataatgcagattaaattgaagcgtacagl^^ 
cxngtggctcacaaattcagcactt^^ 
caacatagtgagacctcatctctttaaaaaaaaaaaaaa;^^ 

AGTCCCAGCTTCimSGATCCCItXIAC^^ 

TCICCAAGTCTCTCTGTCTGTCTGT^^ 

TTATGAG TCACTTATGCCCTTGAAGAGTCAGT^ 

TTTraX5AGAlXX3AGTCTTOCT^ 

GAGTTCAACTGATTCTCCTCSCCTC^ 

TATITITAGTAGAGATGGAGTTICACXXriTTT^ 

CTGCCTCACCTCCawVGTGCTGGGATO^ 
GTATCAAGCAATRCTTT^TGTAACIXXIAC^^
CAAATCCATGAAAGCATTGAGTGAGAATCAATTCGC^^ 
AArreTGTGCTffiC^TCCTTCGTATCACT 
TGGCAGGAGraTTGTTAAACATTTACCACyiGC^^ 
OXXAT GACCT GTGCAATCXrACXriXXr^^ 
GTGTTCnTTCCATCTGATATCACCATT^ 
'TCGCATAGTGGGCrKXrrcAGCGl^^ 
TrATCATTAGCAAATTXXTVGGTCCTCCCC^^ 
CCAARAARMSGSYTGTCATTAAT^ 
TAGTCACCATCnCITTTaXl^^ 
TGACGAAGGMTO 

GATGACATTTTO^TCCCATGCAOnT^ 
CCCCXrAAGCCROTGCCTAAGAAATGCCXAGG^ 
■ ATCIXXXAATATATITCTTCTTATI^^ 
1^ AAGTGTCAATATCrroTAGTACCaXXX;^^ 
TTTGATGGCAAATTGATT^ 
TTTTGTCAGAAGGGATCYGCTGGAATGAGACAR^ 
ATCCGTTTAOTXnAAAATACCXrAGCTAC^^ 
TCCACTTTACATATCAAAAARAAGTTGTATCAAACT^^ 
ACACAAGTTATOXrATCCAAGATTCTAAACTTAGC^
AAGATTGTIXnxa^TCAAGTATAGCACXXXnxnG^^ 
ATGGAGAAAAAACTCCXX3ATATTTAATATTATCAATTGC^^ 

CAAOTTCTrAAGAGGTTTCAGGGAAAAAT^^TACTGA^^ 

**^aacgtsttcaccmmaaaa(xx:atit^^ 

GAATCXrATGATTTAGAAAATCGAAATTTITTAAAAAAA^ 
AATAGGACTTCAGTCXXXTTACTAT^^ 
TTCGCAGGTITTCATCnTATTATGTATATGT^ 
AAGCTTGACTGGCTTGTAAAT^ 
ItXXnroXriTCCGAATGCAGOCCT^^ 

KCAGGCAGTCCX^GGCAGGAAGGCOCArrAGTXSAAAAGAGCrCT^ 
CATITAGGAGATAmCTQGAOT 
TAATGAAOVTGTGXXriTOSGAGATTCCAAC^^ 
GATAGCATTAAAACAAATGCCTCAAATOIArrTTT^^ 
ACTICAAAGATGCTAAATTATCTCTr^

TTXnCAAGAGAGGGTAaXSGGCAACATTTCATAAAAGA^ 

^CACTGCCCTGGT^^ 

TTTGCTITAfiCTT^^ 

AAGlTAACCrriTOnXSAAGGAATGATCCAS^ 
TCCCACTGATACTTACAAATATACAGAGCnXS^^
(nTATCAGAGCCTITGTATCTAAATCSTCT^ 
CCACTTACCATCCACCTGCATACTITACACIWT^^ 
TAAATTTATATTAO^GTrCCCACTCCCC^ 

TCTTTATATAGTATT TATIG AGAAACCTATTCT^
AAGCATTTAGAAAGACTTIXnCTATAGAGCSCI^ 
TAGTCTATCOTATAGAAGAaTCTAAAAAGAGCAGATm 
TAATTAGATAATTATCHATAATAAGCACATTAATTATC?^^ 
AAACAGCTTGTGAGITCACTATTAGGTC^^ 
ATAGAlACATTTTCX:ATCnxnxX3AGGTAAAAGCA

aSGCCCTGGTACACTTAGATAGGCATCATTGCC^^ 
TGTACCAGATTTCCTTACCTGTaXAG^ 
AAAAGGCnrcCTVATITATTATTAAGTGGAATAAGCAGTTTW 
ACACAGGCATAGAAAATAGATATACATAAATAAOXm'ACTATAATTO^^ 
:>:> ATTATCTATTATGrrccCAC^^ 

GATCMOCCATGGTCTGlxmTTGAGAATCACA^ 
AOSCAGGTGGATCACCIXSAAGTCANGGACATC^^ 
ATACAAAAAAAAAATTAGCTCAACCTGGTGGTTT^^ 
CITGAACCCGGGAGGCGGAGGTTACAGTCAACC^^ 
GTCrrCAGAAACAGAAAATAAAACAAACGAAGAGATAGTAT^^
CTAATGAGAAAAAAATAGATCTCXnxnXTTGGAGGGAAT^^ 
T^TACITAACTAACACTTAAGGAATTAAATQGCATATC^ 
TTTTTATTTATTTCT^^ 

GTnnCTATGAACGTTTTTAGAAACGTCCTGGT^^ 
GATOCTAGGTCAGATQGAAAGCTTCCCnC?^^ 

AACTTAGCGGAGGGTACAGGGTCnTCTOX:A^ 
GCArmriGGCAAGCTAGAAAGACAOCTGA^ 
ATTTGTCICTCAGTCAATCACTIt^AAT^^ 
ATXW^TAGCAAGAAAAATAGTTCTTX^GCGTGCC^

CACAGCTGACAGCGCCCCGGGAGTCCTO^ 

AGGGAGGGTTTGCTCTriXSGCTTTTACAAW 

CT^CAAACATACCITCAGGTCAACAGTTAAAGAGAGAACCC^^ 

TTXnxnTATTTATGGTAGCTAGATTTCTATTCATTT^ 

AGAAAATTGACAlXiATCTCriTCTU^TATC^^ 

TTGGCCTCTTTCTACAATTTTXSGT^^ 

CCCTTCACACGTGCCrGCAGGGTATATACTTT^^ 

AACGCGAGGGTTTCCTAATGClCeiTCCATT^ 

GAGAGGTGTTAATGGACACATCATTGAGGCTGAAGCCAAA^^ 

ATCAAGATTAACATrATrATTATTGGTGTTACCI^ 

ATTTTTCCTCCIXXrrrcAAAATAGTG^ 

TAAAGTTTTATGAAAAAGTCTICCATGTCTTTT^ 

CTGAGAATACTAAAATCITACy^GTQCATTlTA 

AlCTATTATTAAGTTACCi'l'inCCIX5AAACAAAATATl^ 

ctacsgaaotgtcacacacacccacccacccac^^ 

agaaaagtaatagaatgagcaattgaaaatcacx:agaactgaaat^ 

tcgacgcttcgcatatgtatitatggctcnx^^ 

caacaggggatacaaatggatttgaattctcn'aac^^ 

gccnxxxxiacatctctcaaaagtcctct^^ 

gggcatcxxsttcacricatgttgggacgggt^^ 

ggtitatttcxrrtctcgattaagagtgatg^ 

ctatttttttttttttitttttt^^ 

ttitctttactagtgagtcgcagatatgti^^ 

TTCCAGCCCCCACCTTCCTATGrrciT^^ 

ACGGCAGTATGrnXriXnCCT^GTATTXTCATAA^ 

GTTCGGGCTTAATATGTGGTAGTTGTTGCTGTTl^^ 

AGATGGAGTCTTXXriCTXSTCACXXIAGGCTG^ 

AAOrcATTCTCCrrGCCICAGCCTCCC^^ 

GGATTTCACCATGTTXSGCCAGGriTCGTCGTGAAC^^ 

GATTACAGACGTGAGTGACCACGCCTGGCCCATAAATGTTATC^^ 

TTGTCCATTTCTACACAATGGy^CTTATAATTTTAGC^ 

TAAGAACrrTTTTAATTGAATGCXXSATTTGTTC^ 

GAAGGTCCXAATGGCACCAGAACAAGGCCAGCTIXCT 

GGTAGAGAAATGGATTGCCnxnXSATTTCACTCT^^ 

CTTTOGTACTCACGGGTGTTATGGTTCATAGCT^^ 

CTTCTTTACTGTCrrcTGAGCTGATrTC^^ 

TTAATCTTRTAGATGAGACAGCTATAGATGGATCSTATAGTCT^ 

GGCTTTTGTGCAGGTGACTTGTGGCCCCCGGC^^ 

CCCAGCTCIX5AAGATTAAAAGGGTTTTCnX3T^^ 

GCICCGATTOCTTTCCTGAAGGCAGTAAGCATGTT^^ 

TGGCTTAG^-rieiuCCTCrrACCCTCTTCC^ 

TOrrTCTAATTTTTTTITCTTCAT^ 

TACACAGATGCATATTTAGGGIGTGCTATGGTTAGAAAGT^ 

CAAGGTCGTGACGGTACTACAAAGCAGGACCTITAGGACAra 

TGGGTTTAGTGCCTTTAtAAAACAGACCCAAGAGAGA^^ 

ATCTATGAGCCAGAGTCGACCCTTACCCMC^^ 

GACS^TAAATTTGTTACTCATAAGCCAfXrrA;^^ 

GATTCTTTTTTCAGTAGTTrCACCTA^ 

ACACTGAGAACATACTCnx^ACCATCXAGTCATl^^ 

tctcoctccacixxx:atcxxx3gccac^ 

CTATGGGATGAGTTCCCAGQCATAGGATraXAGGC^^ 

taoggtgat^gagccaaagtcttgatagaaacactcaacgc^ 

gatacgcacattagaaggagtaatttgaccagccx:atcac^ 

gttitcgaggctggaaaggcaggtggaaagcrcatcag^ 

ggatgtcaggattgtclxxnx3ggaaggcix3^^ 

GGCXXnxSGCTGATTAATAATCnTTGTTATGAACn^ 

CTCTACCACRGCCAATGATTCACCCTCC^ 

TTCGTGGCCTCYGGAAGTOCAGTTTTTC^^ 

TTTTGOCATTXmTITAACAAGCAAAAGAAGAO^^ 

CA<XAGGGCX3AGCAGTTATGGGTGAAGGGCTGAAT^^ 

CT(XCTOAGCCTTGTAGAATTGAGTTCATTC^^ 

CAGCTGTCTTATATAACITAATrcTrcATTra 



GCCTCAAGTGATCCrCCXIACCTC^ 

TTTTTAATOGGGGTCnCTSGGTCXXSTOC^^ 

AGAGCnCAAAGACAAGGCTCAACmtlATATTCCAACG^^ 
AOCTITCCTCC^^

GATGGTTACXriTICTAAGTGTCTTTTTATITC^ 
ACACTTOAACACATAACronTirmr^^ 

TCGCTACTGGAGATOSGGACA^ 

GCCTCTCC^ 

GGAGTreAAAAATAAOXnTAGG^ 

GAfiCAlTAAAGCACAACAAAAACCTAATOrmxa^^ 
<n-IClX3GAAGCTCia7I^^ 
CGATCnCACTCriTCTCTGTCT^^ 
AAGACAOCTACACTCGGAAATKX:^^ 

AGClCTGATGAGACITCGGATGGTAAA(m:A^^ 

GGCTCACGOCnCTAATCACGGCACT^^ 

TCAACATOn-AAAATCXXCnCTCrACTAAAAATACA^ 

CTQGGAGGCTGAGOirAGGAGAATO^^ 

<nxnX3QC31X3«:A(aGTGAAACC^ 

ACCATCTTCATTTAAAGGCACCTATCTAAAOCAAAGGAan^ 
A^aTM^^TlCAAGCTAAGGTAGAC^ 
TCOXnCTTXnSVTTCAAATCGCnCTATAAGAGAGG^ 
TOj^TGTTnXXSGAGCTGAAGGTX^^ 

GGTTTCAGCACATCAO.^ 

GATTTATGAGAGCATGACOCTGGGGGCCTGTGGAACAGCT^^ 
GGCOCATA(n«:AGCTAGGAGCTAGA(mTrAGC»T^ 

AGGACTCTGTITCGTGTGGATCACATGGGAGCTnC^ 
TCAAAIAA^^ 

TGTGACTrAATATTAGAAGCAGACACAGCCATGATGAGCAATACT^^ 
CACITTCTT^^ 

TATXTATTTATOTATTrATTIXnTIXnTTGr^^ 

TGATCTC^^ 

GITCTQCACC^C^^ 

ACTCCTCACCTa?TCATCCACOCGCCT^^ 

ATTTTCAT^ 

TCTCTCCAAGTGCl^ 

^TCATCCCATICTCCTCXXn^ 
A^-AGTAGAGACXSGGGTTTCACXaVTG^ 

TOCCAAACntXnXXXJAlTAC^ , 

^^^^^^'''''^'^'^''^^''"^^^ 
TGCTQGGCTGGACAGAGCAT^ 

GmAAAAAAAAAAAAAAAGGGCAAGGGGAGOTHGA^^ 
ATCATMAAGCATGAirrATGATTTGA 
■K^ACAaXaGTAAGCAGCTCTGGCTnTOT^ 
CACCaGCXXnOGrTOGTCaGTCAACAGATR-^^ 



X iv^x i-iV-B^^-l»lJl«JAGCTAGGGAGTAGGGGGaO^GAAGCCa:Aa 
CACTCXTTAAITGCI^ 

TTCCAAGAGGAAAlCT«nGCTGAG^^ 
^T^AATC^CA^^O^ 

AACAGGTCTCCXXXAGA^ 

CAGTCACCAGOCCCAAACCCTCT«?ICTTX?^ I_ir!ri2zr:* * * ^^^^ 



AATCCCTGCTACACAGGAAGCreAGGCAGGAGAATC^^ 

ATTGCACTCCAACCTGQGCAAAAAGAGCGGGGTAAAAAAACAAC^ 

CACACAACCTAAGGACAACITTGAATCAATTXr^^ 

TC' i T i uuuuvi' i 'ir i ' r i u ' rin ' ri ' rriUM uxx^AGACAGAgic^ 

CAGCTCACTGCCACCTCTGCCI^^ 

CACCATCACA(XTC3GCTAA7TTITGTATITI^^ 

GACTTCAAGTCATGAGACTGCXri^ 

atctcitaatgagtttaactacatttaacx:atgtatgtgt^ 

ataatgtatoitgctacatttaatctttatastag^ 

ggagtgatattgcxrixxxlagggaatgcatgtcaatgt^^ 

gggggaaaaggtgctactggcatctagtgggcaaaggcxmgg^ 

caccacaaaatatcagcagtgccngaggttcggaagc^^ 

CGTAATXrrCGGCrrcACnXXrAAGTTCTGC^^ 

CAGGTQCCTCXX:ACCACACCTGGCrAAl*i*r^ 

TGTrAGCX:AGGATQGTCICGGTCTGCTGATC^ 

GAGCCACCACACCCGCCCTACATGGGTATTCTAAC^ 

GCAAAGAATTAACAQCAGAAGATCXSCTCT^ 

TTTKnTGCCAGTAAAAACATGAGTCAGATTTCATGA 

AAAGTAGTGGTAAGTCGAGGTAAGAGAATCTTTT^^ 

ACCCAriTAOTCTCAGGCTrAATGCTTTTK^^ 

TTGGCAGlTtnAAlTrATTAATTGATTTITAAATTT^^ 

aaggcctatkxmtttcagtggctggltctt^ 
ncctcagccnatcttggnattccx:ac^^ 

gcx>3igagtacagtggcacx3atcttg^ 

CXriX^GTCiCTGGGACTACAGGTGCACACrACCATGCCX^ 

GTATTGCCCAGGCrGGTCTTGAACTCCTGC^^ 

ATGAGCTACTGTACCCGACCAAGAACmCrATrATATAGAT^^ 

AAATGCATTCTCACATAACCTCATAGAAGTGAGCCCrCTTAC^ 

TATITTCTGGTTGGTGGTTTTCTTTCTGTC^^ 

TTTATTTTTTGAGATGGAGTCTTGCACTGTGT^^ 

CCTCTCGAGTTCAAGCGATTCTTCTGCCTCAGC^^ 

TTTTTTTTTATTTTTAGTAGAGATGGGGTI^^ 

ACCTCGGCCrcCCAAAGTGCTGGGATTATAGGTGmX^^ 

TTAAATACCTAGGATTATAAAGGGAGCCAGCTATACTTACATATAGT^^ 

AAAAATAACAGGTAATGCTAATCATAAAGTTTTACAAATCTCCT^ 

AAATITCCCTGACCCCTITXXIAGGCAGGAAGTGGAG^ 

TCAACTGCACTCTCTTCAACCCCCTGCAC^^ 

TAAATQCTTTTGGGCACTGGCAGGAATGAACT^ 

GAAACCCTAAAAGGAGTGTTACAGTCAGTGCTCTTTTAGTI^^ 

GAGGGTCAGCGTGACAGCCTTTTGCACCC^^ 

ATITTAAGGATGGCAAATGTGGGGGATTTTATTGe^^ 

AAGNCAGGAAGGTAATTTTCCCCTGGAGCTATGCCATCAAGCT^ 

TCTCTAATTOXrAGCTQCTTCTCCICTT^^ 

CAGGCCXXSGCXIATGGGTGjamrai^^ 

TTGATACCTGCAATTCriTTGTACATATGA^ 

TAATAATTTTTAACATTGATTGAGTCnX5GATAATAAT^^ 

ATTrcACAACTCCATAAAGTAGTTTIOyCrrrAGri^^ 

AATATOGAATTCGAATTICAAAGCAATCTATGGl^^ 

ACTTTTTGATTTGGTTTGGTTCTTTGG^^ 

GTTCTCACyATCTAGAAATTAAGCAGAAAACAGAGGTt^^ 

TGCrKX3«3AGTCTTTCTTGAGGGAC^^ 

ATTTCCTTTTTGTrTTTCCTTAa 

AATTCmsCACCCTGrmGGTTGAAACAATTG^ 

CATACCAAAAATTATCTTTTGTATnrtaKrrACCAGT^^ 

TAGAANTCAGATGAAAATAAATOGGGCTOGGCGT^ 

TGAATCACTTCAGGTCAGGAGTrCGAGACTAGC^^ 

TTTICTATTITCGGATTCGGTTTAGGTCATtS^^ 

TGAACTCCAGAGGCAGAGGTTGTAGTGAGCCAAACr^^ 

CTCAAAATAAACAAACAAATAAATAAATAAAAATTGrrTTAAACCT^ 

TTTCCCACCTTC(XAAAAGGGTGGGAAATTTT^^ 

TTAGAGACAQGATCTGTAGGTCAGAAGTGGTATTTGGGGGAGTAGG^ 

TTCXXSAGTGGAAGAGCTGTCAGATTCTGAATTTA^ 

TCTCrrTATAAAATCATGTTTTOTATTGAATTAC^ 

TOTATTnTGAANGGAGTITCGGTTKSGl^ 

CTGTCTCTTOACTTCI«rrT^^ 

GADCAOTAGTAAGITCCTGCCTAGGCTCTC^^ 









ATACTAGCTAGGGTGTTCATAAGAAACAGCAGQCATAGTTAC^^ 
OCTVriu-lCATCXnACTGATAAACXSCATACCCAAT 

TGTOTAGCXSAAACTCTCCXTrATGATACXXST^^ 
CIXXXriCCATGATTCAATTACClC

. CAtKXXyWWXATATCTkAAAACCATTT^^ 

GAAGCAAGmGTAGGANC3GGAACANGATGemK^ 
AGAGGTCCTGATGACXrrcGGAGGGTGGGAGCCA^ 
CCTCXCAGTXXXrrCTCAATCGCTAAACC^ 
lU CAGTTQGGAACAGAGTTGACGGAGAGGGGACAGGAGGTC^ 
TXSGTGGCriCATXXXriKnT^TC^ 
GGCCACATTOSftA GAAAAAG AATKnxnTTGGC^^ 
AAAAAATCTCATAM^^ 

CACCAGCCATGGTTTGGACAAGCTTAGTCTAC^ 
1 D GATCACAAQG GACAA GTAGAAAATCCTCKSQCTGGCC^ 
TCTGCACCCAriTTCCCCAGACCT^^ 
CCCCTGATGrrTCAGGGATGAAGTCTAGCCX^ 
CATACTTGCTCATTCmTGTTTCX^^ 
GTTCGATGCCXAGTAAOCAGTACAGTGCCAGA 
TGCCAWVACTAGAGGATGAATGAGTCATGGGTAAG^ 
TOGAAAATAAATACTAGAATCATTGAATCCCTGA^ 
TGTXXSACATCCnXXSATGTATTAGCTA 

AGGAAAACAATAlxnTCCTGAGAATGTTCTGAAGAATATA^ 
^ 'I'ATAGCTGCnCAAGAAATATTAGGCATT^^ 
A^^AACCaACTGAAAACCATTGAGAGGCT^ 
TCAGACGTCCACAGCATXTTCATITCTGCC^ 
GAAGGTATGACAAACATTCTGGAAATCTTO^ 
CCTTATTCTAAATAATCACITT^ 
TTTTATTITATrTTATTITTTT^^ 
ACCTCCACCTCCC^GTTCAAG^ 
TGCCTGGCTAATTTTTOKTTATITrAGTAGA 
TGATClxmCGIXrrCAGCCTCCCAA^ 
AAGACTATACACTGATATCCCOSAGCCCCCAAGTT^^ 
TCrnCCTCTTXXAGTGTT^^ 
3^ <^GATGGACACA<XSATCTGCTCAQGGAGATAAGATCAC^^ 
GAlX3GGGACCAGAGCTAGGTTAGGTCriXriX3G^ 
GCTGCTTTATCTTCTCTCXIATC^^ 
ATTOnCTAAAGAG?UlGGCAAGAGAGGAATGCAGTGTCp^ 
^ CTTCAAGTAGAGGGAAAGQGGACGACAGAAGAACCCTTAGAA^ 
AATATTTITTTGAAAAGAAGGCCTCCCTTG^
CCCTGTAATCATTAATCTCGTCTTCTT^ 
AACACTGTTCCCGTTCGCCATGGGAT^^ 
CCAGTGCTGCTCAGAGGTATGACCTAGAGCAAGGGG^ 
TAATTTATTCACAGTGGACCXSTGAGGTT^^ 
AGGTTAGTTTCGGGTATTTCCAGCCTTAGTCTT^^
TTATAAATTGCTTOCCATGGGCXXnTCn^^ 

atacaqgk:accaca acataa aoca(Xccx:agrgta^ 

CrCACCAAAATGAAGTTTTTAGTCnA^ 
AAATCAAGGCATGAGGACTOTATCCCIXXTG^^ 
CAAGATAATGTAATCTG^^ 
GGGTTCTCriXrn^GCmAGGT^^ 
CCACCCTTACATATCXXAGICTATAT^ 
- CAATGCAATCCCATQCAAACAAAGCTGAGCATCX^^ 
^ ATGACIATGAAAAGGGAAAATGGAGGAGAA^ 
:>:> TGGATTATTTCTCTriTlTAAlTTCXA 

ATTGAGTACTTAGTATGTGCTQGTOX:^^ 

ATTATCAQ GCCTC AACGTCCTGAGAAQCTQGGACT 

AAGTGCnxXXSATTACAGGGCTIXSACXX^ 

TCTAGGTAGATAGATGMGGGITAGGAAGATCSTAAC^^ 
CACTGACTATAGATCTTGAGCCCATTACI^^ 
<=^'I^'ITATCATATGAmACCTCl^^ 
"-^ ™ayvCTGGGAAOCAATTCATClXn^^ 

TTOSGTGATGAGGAAGCTGGATTTGGCAGC^^ 

CCGCC3GTCXX:AGTGAGAGCTGCTTTT^^ 
AAATGAAATTCACCTTTCCXXGATTCTTGAAAACGGAA^

TTTTTTCCAGCATCGTATTTTGAGCGGCACAGA^^ 

TACCTGAAAGNAAAATTTA^KnGATTGm:AGTCCAT^^^ 

ATTAA^KXr^A^I^m«:ACTCACACCAA;^^ 

TCACATTATITTCGATGTNCa^^ 

TACATllXrApSTTTGTTTNAATAT^^ 

GGTCTATGCACTTCGGCCACGGGCTITCCCAC^^ 

^CTGTAGCCTGCCCTCCTCITTAACTGAA^^ 

TGAGAGAATTGCTCAAGCTGCAGTTTATCATTATAT^^ 

ATTTTTCATAACCXnTTACTAGTAATAAATTTGATTGTTAW 

Tl^TAAAACxxrrcA^^T^AGGAGCATTTTlram^^^ 

TCTCACCTGATATTQGAGGATTGTTTATTTCTC^^ 
CTTOITTATATGTTTCriTCTGTKX^ 
CTTTCACAGGTGTWQGGANTGNAATGGGGAAAGAAGT^^ 
IXSGCAGCAGTCACNAATTaTGTGTlXXXn'AATA 

cctttaatgatcttttattgaattttggt^^ 

gtgttcaactgaactatggtcsgttaggttagtaattagat^^ 

actggtaaaggagtcttgtgaggatraattgaaataact^^ 

ttgtcataggagccattgttgtagccttagaaaacat;^ 

agtqcatttctcttccagtaagcaattctg^ 

ttgttgacagaagtaggacttcaggtccacatatat^^ 

ClCTTCCACGlxnX3GACAGATTGGCACAT^ 

aX3CACXnTGCCAGAACATGTTTrcariCT 

AGATAATCTAACA<XAGCAATGATGCXXAGAGCATTTAT^^ 

TACTGTACCTCTATTGGGAAATTTACTCICT^^ 

ATTrAAATACAGGCXrrTTax:ATTTCACAGT^^ 

TCTCTCTCTCTCTCAAACACACAC^ 

ACATTCTTTTTCCACCCTCACACTTTTT^ 

TATTTAGGTATAGTTAGAAAAAGCAAAAimKSGTAACAATAGAATCTT^^ 

ACAGAAAAAAAAAAAAAGGGAATCAAGAAhKXXlAGl^STGCTTT^ 

TTAAAATCACCATCTTGATTCrrGAGCTTreGT^^ 

TTGCGTGTGGTAGTCCAGCACATTTGCTGATTC^^ 

CATTGATGTGGGGTAAAATGAGAAAAGGATTAGTITTATITTC^^ 

TAATGCTGATATTTCCCGTGGGTCTCTGACTCTGA^ 

GAGCICTGAGTTTercTGGATTTCTATAGGCa^^ 

TCTCATTCTGTTGCCXXX3GCTGCAGTGC^^^ 

CTTCCCACCTCAGarrGAGTAGATGGGaCT^ 

GGTAGAGCXrrCACTTTGCTGCCTAAGCTGATCTAAAA 

CTGGGACCAATAGAGATTTCTCTGAGAATTAGGlCrc 

GTGCXSTAGATACAATTAANTTTAGAATTITTAAAAGGCTT^^ 

ACATTTATITTIWVACGTGAAAAGCAATITAGAAAATTGCACAT^ 

GTACXSGOrGGGCGCGGTGGCrrcACGCCTGTAATC^^ 

TCGAGACCATCCCGGCTAAAACGGTGAAACCCCGTCr^ 

CIxn'AGGCCCAACTACrTGGGAGGCTGAGQCAG^ 

CCGCCACTGTACTCCAGCCrGGGCGAC^^ 

TACACAGACAAATGTTCTGACAGAAATACACTGA 

ICTTTATAGTTAAGTOA^flGAACTTGGCCAGAC^ 

TGACATTGAAAAAGTX5ATTTAGa:TTTGT^^ 

CCCAGGGTTATATGTCAATOGGTTAGGCACAGACCT^ 

GTTrCTGGTTTTCTGATTTTTGTC^ 

ATTTGTTTTGrrTGTAAGAATCAGITGAGGAC^^ 

AGAGGGTTTCTTTCAACAGACnSlTCCC^^ 

TAGGACCTGGACTTAATCriTCATGATGTTT^ 

ACATCTCAGGATTTGAAATCm'AAGCTAAaVAAlT^^ 

GATGaUWXTTACGTAATCATCTCCTGGTGC^ 

GACATGAaxnTWXXTCTXnrCKXAA 

AAAGACAGAGTGGTGTTGCAGGGACCTGCAGTCAGI^^ 

TGTTTlTGTTrTTTAG^AGTTAAAGCrrcAG^ 

AGCCGATCTGCAGAGGTGACCGTGITCGCTGGAAT^^ 

TGCXSTAAlXXACXXAGCAGAGTTCnXXri^^ 

CTATCAG(XXnCTAAACT503CATAACCTTGAAT^ 

CCAAGTCCTTTCGTTTCCTTTCTTlt^^ 

GGCCTGGGGGTTGTCACAATGTGATCXXTGAGGAT^ 

GAGGAGGAGAGTAGAA^KXMAATAGCAAACAATTTTTCXr^^ 

GGGATGTCGTTGGGGAAGGGCATAGTACACATAaTrGAGCAGAGTAC^ 

AGTGTGGGCTGGCATGTTCTTCIXXACAGl^^ 

TGGCACAGATGGCCTTCnTGATCrrCXIAGlTAGa 

CGTTKXrTCTTGGCTOCCTACTGGCTAGGC^^ 

TAGAAGAAAACCAGAGGTCKnGCACACTAT^ 






ATaaVCXrTTACACAGAAGTTTOOXXOTAACATGGa^^ 
(XXy^TACCATATQCTGAAAGACTTGCAATGCAAAAACCTAAGGCC^ 
OSGAGGCATCAAGACTATAAGATCCAAGACCTTAACTTT^^ 
AGAAAOCTGTGAGATC(XTrGAATATACTtaCAATACACTGAT^ 
GCCTGATAGGTtXXXTCTCTTCCT^ 
TCAATAAACTAACATCTCAACAGCCTCXXATAAATT^^ 
AGAATTICTAGATAGACTTACATCCCTCAGTTTCAAAGCTT^ 
CTAGCCAAGCAGAAACCCATCmTITGACTTTCACX^^ 
CATAATCTOTAATATlOiGTGAATAAAAAATlty^GTTTT^ 
ATATrcrCTCTCACTATAATTGTAAAACAGTAGATGTC^^ 
AACTCOTAAATAATACAATCATACn-AACAATlTATrcAGAGATTT^^ 
CrAAAACCTTTATTAAGTAGACATAATCTTCAlXnTATAA^^ 
<^GTTAAATCAATG(XAAAGTX^ 
1^ GCCATTTTTAAACTCATTGCAATGOICA^ 

ACACCTCTAATTCCAGCACTrroAGAGGATAACTTa 

CTATCTCTACAAAAAATTTAAAAATTAGCTGGGTATGGTTXnX^^ 

GGGGGATCACTICAGCCCANGAAGTrcAAGGI^^ 

GAGTt»GACTAOTCTC»AAAAAGAAAAAATCAGATTATGG(XAGGGGTCGa^^ 
GAGGCCAAGGTGGGAACATCATTTGAG(XCAGGAGTTAAAGACa^^ 
GAGAGAGAGAGAAAGAAAGAGAGAGAGATTATTICTAGAATOAAGCAAAC^ 
TAGTt5ArrCTCrAAGTACATAC(XaCATICTGATT^^ 
^"^ACCACnn^XXAQGGAGTGACTQCAAQGGTT^^ 
GCOCCAACTCAGGGATGTTGTATT^^ 
OCTCGTCGGAmTATTGAGAOGGTTOCTTlT'AATCCA

TTATGTOXXX^VGGTTGGCCTCAAAGOXXTCGGATGAAGAGC^^ 
CTGCCTGCAACCATGICT^ 

gagaccnacagtgatccxcaggaatagtx:atggttqgcgattaaanta^ 

TGTICATTIXXTATGGCTAACATCACAAAATACX:Aa^ 
TGGAGGCTGGAAATTCAAGACAAAGtmXXTGGCAGGGTCTC^^
CTTGCTCTGIOriCACGTGGCCTTCCGT^^ 
qilAGTCCTCTTtXSATTAGGGC^CCACGCITAT^ 

CAGTCACATQGAGGGTTAGAACTTCAGGATATAAATTTGAGCaJG^^ 
<^CAAGGAATGriTATGGTGGATAAGGGGAAAATGGTCCCTACXX^^ 
TAAAGAAATTCACATAATTAATCACAAAAAGCTAATACTATCACAAAGAATO^
GACTCTAACCTTCTGTAGGAGCTAGAGAAGGCTICrcTCA 

GGGGTCAGGCAGGAGAAGGCTIGGGACCAGGAOGTAGAGAAAACAGTAGTXnxyvTC^ 
TACAACAGGTTICGGGACTreAAAaxaVGCCavCATTGAGT^ 
ftCTGGACCCAGAGTTTCATCACTTTGTTAA^ 
TTACGSOTCCATTGriTTATATr^^

TATirrTAAAATATAAATTCCTAGTCCCaCTCCCTAAAGAT^^ 
GTGTTTAAGAAGTGCATCAGGTICITCTGCTACn^^ 

AGTrACTG?OH>GlXm-ACCAGAGAGAGCCTAGAAAAOCTCXXXaiC^^ 

TCAGTCTKXXaACATGGraAACCC^^

°°'^'^^*^^^^^*CTTGGGAGGCTGAGGT^^ 
CACnXX:ACTCCAGCCTQQpAGAGTXn'GAGACCTI^^ 
GTCTGCCACTTTCAACCCACXSGGGATAGGGTCBG^ 
"ITroTGTACATCAQCTCTTICAAATrAATA^ 

3U ■^^TCAGGGACflGAGAGGTTAGATGTGTATCTAAGGTCaCATAQCTCAC^^ . 

TITCAATGfiC(X:TCnGGTGAAGTGTGAri^^ 

GTGATTCTACTICTCATrCATTAGGCCATCCAGAAAGCXnT^^ 

AAACTAGGTCGCTTATAAACACCTGAAATCTArr^^ 
<-<; <^™^^«3TGTTTGGCAACX3GCCX:ACTTTC^^ 

aaagogtctctctatctggcx:tctito

AAGGCCCTA^KXrrcCTAATATAT^ATATTOGTGATTAAATTI^^ 
CCAGCACTl^^ 

CXJTCTTITCTAAAAACATITrAAAAATTAGCCGGGTGTGGT^^ 
CAGGAGAATCACTTCAACOAGGAGGTQGAGGTTGOVGT^^ 
^^'^^'^^CrKmnCAAAAAAAAAAAAAAAAAAAAAAAAATTCAA^ 
Mn-ACTCATCTQGATATGATTTGAGCTAAGACT^^ 

TCCCTGWVITTAGGGGGAGACAT^^ 
^GCCTCATTGGATCAATCATATT^ 
GTOGACTCTAAAT^ 

TAmWTOAATTTKXSAGATGAAA™ 

ANCCraVTTAACATCAAGGCTTftAlTGTAAGTrAATGT^^ 
GTTTTGIKXrCXXAAATATAGCCCCAAACrcGCATTT^^

ITAAANCATCATA^KrmTATCGGTCSCCT^ 

ATAAATCTCAGCTCTCrAGAGTGTCACTGTGGTAT^ 

CGGTCGTAAAGGAGGGACAATAGAACCTCTGTAGAATCAAAGGCAGAA 

GCCACGAATAGATTCATTCTTCACGTtrrCAAATATA^ 

TGAATClXXriCTAAGCnXTTCOTAATATATT^ 

TATATTATTCATCGTAATGTAATTCCCCXrTTTAATC^^ 

AGTTGGATCGACCATTAATCATGCTATCTCGACI^^ 

GCAGCAAGACATGGGGCCCACTCCAGACXJATTGAACX^ 

GCTAATCXSAAATCATTCTGGGCAAGGAAACAAAA 

ATTCTGAACGGCITAATCXrrACAGTTTAGATAT^^ 

ATACCTGTAACAAACATriCTATGGTGTATGTAGGGCAGGAAC^ 

TTAGCCAGATAAAATAAACAAGGACAGAGTTGACTGAGAAAAAAGAAAGA 

AAAATTCCCATCXXACCCXXGCAACAATGAAGGGAAAT^ 

TGGTTGAATGAACnTACCTGGTAGGGATTTAATOT 

TCTTTAACATTCTCTTCTTCACTAAACCXT^ 

ATTTATAtnCAGAAGAACAAAAGCAACCXlAATTATTT^ 

GCAAACTTTGCAAAGGGGTCACAATGATAAATATTTr^^ 

ATCATCX:ATTAGTGAAAGCACATAGATAATCn^ATAAA(XAATGGa 

ACAGGTGGTAGACTGGATrTAGCTTCnXSGGTT^ 

AATTTCAAAAATCATTTlXrrTITTTGA 

TGAATTTTCATTTTIXmrPGACATAAC^^ 

TCTTACTCCTGGGCTGlXXrrTGGGGATGGCAG^ 

TGATACGCTTTTCCATCCTTCAATAACrixa^^ 

CGGACCTAATTTriTTTTTAAGGATTAAGAAAAAAGG 

CATTAATTGATAAATCCACTGTAAGGAATGCATGACACTGTC^^ 

CAGTCAATATATTCAGTGAAACAAATACXrrAATATATTCCATGAAACAA 

CATCAATATATTGTGTGAAACAAATATCGIX^TATATO 

ATGAATTTATTCCATCAAACAAATATCGTGAATATATTCCAT^^ 

AGAAAOrrAATCAAAACAGCATGCTCCTATATCnT^^ 

AGCATTGTCrrcAATrTTCCX:AA(^^ 

TTGACCCAGTGTTGTAAGGAAATGGGAGAATATTCXrCTGATC 

TTGGAATGTTGGAGGGAGGCTGGGCTAAGGTGTICTGT^ 

GGATTCTGTCTGGTAGCTGGTTGCTCTTCGCGCC^ 

TGCCACTGCX:CCCGCXXCACCTGCCCCCX3CCC^^ 

GCCTCCCATIOUVGCATCCTGNAAAAAGCAGANC^^ 

AGGGACATGATTGCTCTTCATTTTACGCrCX^^ 

GTGCCXrAAGCANGGTGlNGGTTCTCr^ 

AGGGAGACTTTGAGACCTGGGGAGGGCICTCT^ 

TTCTTGGCTTCTTATAACACTTCCTTTC^^ 

GCCTGATGAC(X7^GGGACCCAGGCAGTCCT^ 

ACTTCATTAACXrCCATTTCCCCICAAAAGATGATAC^^ 

ATCTGATTTACAGAAAGCCATCTGCTTCCATAC^^ 

CTCX:ATAGCATCTGAATCTrAATGAATAAAAGCACTGAC^ 

TACAGTTTTCTAATCXrAGCTCCATGCTGGATATTT^ 

GTGAGGGAGCGATGAGTAGAGGAQCAGCTGGAC3^GGTC^ 

CCGCATTCAGAACGAGAjCStriCATAOGGCAlCC^ 

TCTCNCATCTCTCnTAGriTCTTTC^^ 

CTXrATTCNTTIAlTCanTCTltJC^ 

TTTATTCnTATTGTGGTATOnT^GGCCAAG^ 

TCCATGGGGACTOXACNATGCCACAGGGAGTOIACC^ 

GAGAACTGTCAAGAGTTITCNAGCTCGAAGCTATO^ 

GGTGTICACTICTAAGTGGGAGCTGAACAATGNAG^ 

GAGGAGTGGGGTTGGGGGAGGGAGAGCAOXIT^GGAAGAATAGCTAATGGAT^^ 

TAGGTQCAGCAAACCACXATGGCACAGGriTACXrrAlCT 

TTTTTTTAAAAAATCnTTTTATTGTCAlTT^^ 

CTTOXnCATTTCCATQGGCTCTGGGAC^^ 

GTGGCnOTGGCXnX3GAC?riCGAGCCCAATAA^ 

gtggttaccagtgag(nx3tcractatc^ 

gggaat3^taagtgagcnagtatggc<2^ccctaat^^ 

catgatagagaagttttcxctagaaggtctcnx^ 

tcaocgagc?ixxx:aatcagtgtggccct^^ 

agggaagtcnxxx:tgcxxx3gagagcttggctc^ 

AGJ^GAGAAAAGTGTAGAAAGGTCGGGTGTCATACAQC^^ 

agggcatcaaacacgcx:gtcaagcagggaqgcaa^^ 

gtaatgggtaacaggctaaaccatgtggcaccantco^ 

tggcctqccttaaggcagtagcacmragagcaggga 

GGGAAGAGAAGAACTCnx::AAGATriTCACTATC^^ 
ACACTAAGTATC3TGGGAATTATTTATATCCTA(XXX^ 



CAACCTGAGGCATAAATGGGITTTAAGCTCCATACXAGTAAAACT^ 
AGAAAQGATAGATATAGATACAimWTACACATATWnXMCA^ 
TATreTATGATACATATCCATGCATATACGTGGGTGTACVC^^ 
AAAAAAATCCCCAGGACAGTCrrcATTTGGCTIT^^ 
TlXXATTCAGGGCACGCAGATTTGGTTXXriX:^^ 
AGAAGCCCACTGCCCACCCAGAGCCCCTGACT^^ 
TGGAGACAATCXATTCCTTTCACAAAGACTGTAAGCC^ 
ACAAAGCACTGAGTAAGAAGAAAAGGGGCCCGGATAAACCAGCAGG^ 
AAGGTCCCGAQGAATGTGGAGACAriXXnXSGGGGCCAGGGTG^ 

TCTCTITCTCTCTCTCT^^ 

TAGTTTCTCAGAATCCTCATTC^ 
AGCATTTTTCAACCAAACrrcAAAGGGA^ 
TTGTACATTTGGCACAGTTATTCTTGAATACT^^ 
GCrTTAGGTAAATTACTTATTCICTTT^ 

GGGGCTTCTCnXSAGTATTCATCAAQGTAATATATGC^ 
TmTTTTTACTACTTGCATTATTGTT^^ 
AGGGACAATGTTCACCXnrraiACTGGAAGTAAG;^^ 
AGGAGTCXrAAAAGAAGCCTCTTGrOTGACTT^^ 
AATCAGAATtXXIAGCTTAAATAAATGTTAl^ 

CAGGGlATTTATTATTCTGrrcATAATAAT;^ 

GGAAGTAACAGGGAGAAGAACTGTGAGAGATGGAGGTCXAGCCAGGC^^ 
TGAAGATTCATTCATTXnTCCATAGGTACCTAGAGG^ 
(nCAATTAGGOCCTCCTT^ 
AATAAACAGGGTACTITCXXrrcCTCCTCT^^ 

CATTGGGAACGCTTGGCTTCXrrcGCTGTC^ 

AGAGGAGCAGAGTTCTTCTGGTCTCAGAGCAaS^^ 

TGAGATGATTCAOTTATCTTGAGrGCCATCTTAAC^^ 

TIXSAAATCAATCGCAGCTGGACCXXSCGTGC^ 

^_ CAACCAAACGT^CGGOXSTGTGATTT^^ 

TTTATTTTATTTTTATTTTTCGAGATGGACm^ 

GCAAACTACCCCCGCCCCCGGGTTCAAGCAATTCSTC^ 
CAAGCCCGGCTAATTTTTCTATTTTCAATAGAGAT^^ 
AGCyTCATTCAOXACCTCAGCCTCCCAAAGl^^ 

AAACACATTGa^TAAAATAAAAAGAAAAGGGATTGTCCTAAGTGA(^ 
AGAAGTCCACATTGCTCCATTTGGACTTCCI^^ 

AGCTTGCTAACACAGGGAAAGACCAATTATCATGATTATTATTAT^ 

CATAGACATTTGGAAGCAAAGAGAAATGCTTTTCATGTGa 

CC<XGCACCX5GGCTCTTGGGAATATCTC^^ 

ATTCTAGGTGTTGGGGATACATCCAAAAACAAGCAGGAAAACCC^ 
CCATGGAGCCCXnrrCACTTACACGTTTT^ 

ACCXriCTATXrACAGCCTITCCTAAAACCACTGAAA 

ATTCCTCTAGGGAAGGATTTTTTTTITm 

GTGGCGCGATGTCAGCTCACTGCAACCTCCG^ 

ATTATAGTCACCTACCyVCCATACCCAGCTAATTTTTOT 
CTCTCGAACTCCT^ 

CTGOTGCTTTTTCAGAGCCTCGAAC^^ 

AGATGTGGGGCrcTGATT:CrcAAGGCC^ 

GCCXnxXXrroCTGCXIACCACCGCCTT^^ 

AGTM^AA^ATOT^ 
GATOICTTTGGTTIGGCCTGCT^ 

AGACATCXXlATCTUVAATTOGGAlTTCXn^^ 

CTACACmXSGCTGGACITGAGCAGCGGCTGCCCC^^ 

GGCXTKXrnCACTCATTOTAGAAATGGGC^^ 

GCITACACCTAGCTGTTGTGTTGGTTC^ 
GTC^TICIXXCCAGT^^ 

TGCCrrrrCCTTCACCAAACCAAGTQGC^^ 

GlXnXSAGTTGGGTAOnTGCATCXrrCC^^ 

ATTCnXrrcCTGGCTATT^XAGGC^^ 

ACAAGATCnX^GCCGCTACGTGGAAGGACACATAGTTC^^ 
AAGGGTGAGCTTOGAGATAGAGCTGACXriTCCCA^^ 

<^<^GAAAGCACTGGGAGCGTTTATGAATGTTGTGCT 

CCTCTGGCrrTCAGGGAAGAACGATGGAOTO^^ 

AGATCATGGAQGCCAGAGOrAGGGATGGATATAGACTAAGACAT^^ 

ATTCTAGCATGAGACTGGGAGAAAGAGATCmC^^ 
CAAGCATTGCCiaSATCACCTTGTTAATATGG^ 

TTjJ AGCAACACTATTA GCAAATGGC^^ 

TT TO:a-A- i-i-iviu'l-lU'AATOCICCTCC^^ 

GACTTTGATCGTGTGCATGTATTGCAGAAATGGA^^ 

CX:ACAAAACAACAGGCTCX:ACCCTCCTTCXr 



TGCAGJAAGCCTAGCnCCCXy^GCITCTOC^^ 

GCTGTTATCTTGCTTCAGAGACXriTAGT^^ 

TTirCTAGGCXATTITTITCTTTOr^^ 

AGTAGCXSCmTCTCCAGCTAGTAGCTCAGAATAAGTCAT^ 

AATCTGTTTGGATCTCAAATCTTTCTT^^ 

CTGTAlTTACrcGlTAGAATAGAATATTATTACCATCA^ 

tccx:agcactttggc3«^ggccgaggtgggt^ 
aa^k^ccx:cgtc^wc^actaaaaatacmaaaa^ittag<xgg 
gaggctgaggcaggagaatcccttgaacctgncxsag^ 
ctaggcaacaagagtgaaactctgtctcaaaaaaaattaatrataa 

ATCATlVNmSAAACCAGTTeiTAAGCT^TTAAAAAAAAGCAGC^^ 

TGRAGGCATGGTCATWRTGAATGCCWAAGAAATGAACAGGAGAAAATT^ 

AGAATTTTTTTTICTTTOXITATTGCTGTOT 

CTACAATTCICTAGATCAGAAGCCTAGACACAATGTGAC^ 

GTCCXSCTGGAGGAACATCCTTTCCTGCATGC^^ 

AGATCCTTGAAGTTGTAAGACTAGGGCCCCIXmTCCTAC^^ 

CGACATTCCTTGACITTTGGCCCCCTGC^ 

TTTGAATTCTTTCTGACTTTTCTTO 

AGTTCCCTCCTGCTTTGAATCTCriCT^^ 

GATAGTrrCTTTATCTTAAGGGCAACTGATGT(^ 

TATGAATTGACTCAGTAACTGGAAGAATGCATCCGTACA(^ 

TATCAAAGCrCCATTTTTTITTGAATAAC^^ 

GTTTAATATAAAAGAAAGGGGGTAATTCAAGAAGATATAAACTGCAAAGAAAG 

AOCAGAGGTTTTATTACAGGCAGACACGGCTTCACCI^^ 

TTCXriTCTATGGCTCAGGATCTTC^UOT 

AACTCATTATCTIXriCTGTGCCTTAACGTCT^ 

TTTGCACCAACOPrAACAATbK3GGTTGAC^ 

TTAGCCCCGTGGAGGCATATGCTCGGCACTCAACAA^ 

AATTGGATGCTCAACTCTGGATrTGATCCAAGC^ 

TTTGGGCACAGGAGNAAGGNAAATAAGAGCCTCGCGAGGT^^ 

TCCTA^CTCTGAGAGCTCACrITAlXnK::AAAAACCCATGC^ 

ACTATACGATCTAGTGTITTCGTTGCTTACAAGTTAGATTC^ 

TGAGCATACCAGAGTITAAGGGAGGCAAGCAAGGCCCTTAACX^I^^ 

CeITCAG^KrITGAGGACTCCATAAATGTCTGCCCl^ 

TCACTTACTCCCATOXAGAGGCCrcTTTC^ 

TTTGAGGCCATCTGCACTCACTGACAAaCTGAT^ 

CTTTAACTCAGGGAAGACCCTGTCTATGGAAGTGCTG^ 

GTACGGAGTCAIXIIACATCCCATAGCTGAGCGACGTCGGAGCTC 

AAAGAGAACTTGAGCAAACTCTTACKrKXX^ 

TTTTAATCCCTCCCCCTATCTTTCTTTCTTO 

ACAGCATACATACTAATCAGGTTTGAAAGTTGTCTTX3TGAa 

AAGGATGTCTGCAATCAAGATGATTACATGGGAAAATAAATCAT^ 

TTGACAATACTITCCTTTCAGTCTATTTTCC^ 

TAITTATTTCTITTCAATACTCTGAATATCACT^^ 

CTCACAGGTATTGAAGCCAAAAATTGAACAATAACAAOT 

GCACTCOTGTGGGTGAACACACACACACACACACA^ 

AATAGAACGTAAAGGTAGATTTITAGAACX:AAATGCCTTT15^^ 

AATAAACTATATTAAArrATCCAGGAGGAATGGTTAGTGATC^^ 

ATTIXnX3TCCrrcGTO3GCAGATATTAAGTC^ 

ATTCTTCCCTGTCCCCTGCATATGTACATCCTTGCAC^ 

TCC7«rrCCTTGAACX:AGGATGGCATGAAACTT^ 

AACGTCXrrriTATGTTGGTGCTTGCCCACTTGC^ 

CCTTCTGGGCXSATGAGAGACACAAGACCATGTTTCCATAIT^^ 

GCTGACAGCAGACACCrxXXXAGCTQAGCCCAGC^ 

TGTTTATAGCCACTAATATITGrrcATGGTTCA 

TCGGGAGCTGTrcTAAAGAAGTAAAATGTGTAGTGTTTGCC^ 

GTTAlTTGC?rGAGTCAACOCTCGCTGAaX3^ 

ATATGCTACAGGGATATTCAAGAAATTGAAGATCXrrTC^^ 

TGGGAGGGGTAGAGAGTGAGTGTTAACACACAGAGAA(XCAGGAG^ 

TTTCTTCCTAAGCAATCAATTTCTTr^ 

AGCTCAAAAATAAATTTCTGGCACCCTTCTIXX^^ 

ATlTCACCTATTTCCTTAaTICTlTAAA/^^ 

TATTTTTTCTTTTCTTTCCTCT^ 

TTGTATTTATCCATGCXATCTGAAATCCTCCTT^^ 

ACTACXTAATTGTTCGAGCCXrrAGGACXXrGAGAGA^ 

ACTGGAAGGAACXriCAGriTTGTCCATTCATTCAT^^ 

TGGTGCCTCTAAGTTTGAAGTCTAACTCCAT^ 

ACCAAAAACGTACTGGCTTTAAACAACACATGCITATCAT^^ 

CCnxSGTCAAGATGTCAGTAGGGCTTOTn^^ 



GAGGCCTCCTCXIATTCCTTGCXri^^ 
TGATTCCTCTTCCCmxnCCCAT^ 

TOXXyVATi«X:A(»TAATCTTC(»TAATCACTCCATN^^ 
«nX3ATGACAGAGTCACAAATTTGGGCaU^TTA(nX^ 

TAAGCTITATCCAGGGGGCCACTCA(?lxaVTCGGAGTCT^^ 
TGTGGGTAAGACCATCACAlTTTTCATCCa 
TGTirroiCKXCTGGATGCXriTrAACATA^^ 

GCCTTTGTCnX3TC(nxrKX^^ 
ACCCC(n«XCCATCATCTCAATTX3AAAT^^ 
GCTTCAGa:ATATCATTITCG^ 
CrmACTAAGTGATTCAGAAAATACX:^ 
™?reAGTICroCOVAAAACC^ 

AAGTGCTCAGAGGCNAGCCGGAA(XTOIXKnT^ 



..v-xv^,. »v;,vi„iv:^^i i '«"'--^i«Ai«t;UATAUANCTGGGCAGGTTAAACAGa^ 

ACCCC(n«XCCATCATCTCAATTX3AAATTAGT^ 

CAGa:ATATX»TTITCGGGGAACTTGGC3GAG^^ 

TACTAAGTGATTCAGAAAATACXIAACTATAGTAGTGaSAGAACATCCTATX^ 
'^^^^5'™'«5Cx:AAAAACCflA<nTn^^ 

AACTGCTCAGAGGCNAGCCGGAA(XTX^^ 
CTCTCAGGACACXSCXrtCA^ 
CTAGAATTCXrrC^ 

TTCTlTCATrcATCAATCAGAG^^ 
CTCTGTCAATAGTATTAG^^^ 
TCATAGCAGAAACTCACATCnrnxnGCTT^ 
CACCCTAAGTGATAGGTGTATOCATCATIXnCACTATCAG^ 
CAAaXX:AACCCTG(?K3AAC3GQCGTGAT^^ 

CCAGCACTITCGGAGGCCGAGO 
CCGNNTCTCT 

CGGAGGCTGAGACAGGANGAATGTCATGAACrCGGG^ 



AA^GGITCCCAAGCTCGCCGCAGCC^ 

AGTTTITCTCCCXXIAACTCMGCXACCAAACCAG^ 
*^CAGAGTGTGTCAAGCCTGCAGCTCGTCC^^ 

GTQGCTCA0CGGGGCaTKX:GAGGTTTATTAATCrTAG(^ATAATr3^ « 



GTOGCTCAOCGGGGCOTKXrGAG^ 
TGIT^CTACTCCGCGAATTCy^C^a 

GA^^CGCCGTCGTAAAAATCCGTCXX3GAAT^^ 
ACTITCACCAAAAATCAT^^ 

TTTAGAAAATATATGAGGTACGGGATGAAGATGAGGGAGGAAAATO^^ 
An '^g^Gft^TT^AACCACTGCTITCCAACnTATATATTAAAT^^ 
GCCTCCT^TTCAAGC^^ 

^CTCGGCCTCCCAAAGTOr^ 

^CTAGATarAlCTAATAATCCTAACACTGTCAlXXr™ 
TTAACAGATGACnV^ACTCTGGAm 

^C^GGATACCTAGGGACX^X^TCG^^ 

TGGTTOAAAATATCACnXmr^^ 
OCA^GGAGCAGTroGAGAAAAOM^^ 
CTTI^ACATATATGAITATGAAGCAATT^ 
f^-I«5CTGTAClTClCAATACTACACnXJATO^ 

^^ATCCnGCTCOTAGAfiC^ 

^ATCGCy^TTAAAACTCTAAAACTAACAGGTCGATATAA^^ 
^ATIT^CTTOX^AAO^ 

TC™Trj^TATAAAAACTTGCXn«ATAAATGl«Aa 
CTAATGCAACA^ 

TTTrAATTITrCTTITATAGAGATXXJ«n-ATC«r^^ 

CAGTCGGGGAGCCaGCCAAAAAATCAGTAAACACCAAGATAAAATATATIXXAC^^ 
AAACNATXWCACTO^GCTCXIftAAT^ 



ACAGGGTTTCAAAGGTGAGAAGGAGCTGGCAGTC?rcAATAGra 

AAACXXnXXXXrrcGC7U«3GCAAAGATTATTCAT^ 

ATTrTTAAATTTAGAATTGAGGTTTGAAGQGAAGT 

TTACAGTAGAAGAGAAATOTACATTTCTGAAAATCCACTGAGG 

CAGACAGCCTTTGCTTTATGTATCAATTTTTCT 

AGGTAAATGACITCCCXAAGGTCATACAC<XAGTACA<^^ 

CATTAAATGTCSTATATATATATATAGTGTQCATTACAGCTTAACAGACACACA 

TATTAATAAGTTTTTCSTQGGCATAGGTCTTGTTTT^ 

AAACTGTCCTATAAAGAATACCATATTATCGAAAGGATGTTC^^ 

AGTGGCTCACACXTIXm'ATCTCCGCACTrAAGGAGTCX^ 

TGGCCAACATGGTGAAACOXTKriCTACTAAAAATAGA^^ 

ACTCGGGAGGGTCAAGCAGGAGAATCGCTQGAACCCGGGAGACAGGT^ 

GCCTGGGCAACAGAGCTAGATTCCATCTCAAAAAAAG(^ 

CAGATTGGCAAAACGTTAAGTACTAAGGICATAACAAGCACTGGA^ 

GAACTGAAACAGCCATTTTGGAAAATATTTTGGTATTACGTAAT^^ 

ATAAAGATGCCTTAGAGAAACrKOTGAGAACTAGGAAAAGCl^^ 

CCTCAAAGTCCTCCAAAAGCCGTCTACAATGAAAT^^ 

TAATGGGTGGATACCAGAACGTGCATCAACACAGAAGAATCTC^ 

GGGCACAGTGGCCCACACCTGTAATGCCAGCACTTTGGAAGGCCA 

CXAACCCGGGCAACATACTGAGACCCXZATCTGTACTAAAAATAC^^ 

TCAGGGGTCTCAGGTACTrcGGAGACTGAGGAQGGAGGACC^^ 

TGTACCACTGCACTCCAGCCTy3GGCTACAGAGTAAGACa^^ 

AGAATATATaCAGTATGATTX::AaTTTATAGAAAGCTrAAAGTATAGGC^^ 

ATAAAGAAAaCCAAGGGAATGAAGAATGCAAAATATGAGCTAGlXnTIT^ 

GGAAAACAGAATGGTGCCATGGCX3AGAGATACACAGGGGAGGTATTC3GA?^^ 

COTAGCTAGTCATTCTATATGTTATTAAAGTGTACACACAT^ 

TAATCTGAAAGAATCTGCTTCCTGTAGTAAAAATTAAAA^^ 

CACITTGGGAGGCCAAGGCAGGCGGATCACX^AGGTC^ 

CC?^AAAATACAAAAAATrAGCCAGGCATGGTGGCGGGGTGCX^^ 

GGCGTGAACCCAGGAGGCAGAGCrrcCAGTGAGCCGAGATCGTGC^ 

CATCrrcAAAACAAACyWVCAAANCAA?^CAAAAAAGGTTT 

atacttggcicttttgttttttcttct 

tacaccctgtgatttgaattttccatacio::at^ 

atcactcctctgctcaaaaacattcagtggctra^ 

tlkxrccagtgtactctctgtctcactaatttgt^^ 

aagatgggtagcaggaatgctgcactgaaccataatccaa?^^^ 

CT^TCnXXriCATTCTCGATCICCACTTGAAC^ 

ATTAACACAANATTITCCAGCTTAAAACAGCAC^^ 

GACTTAGCTGGGTCCTCIXXrCTAGGGimm^ 

TTGACFGGGGAAGAATTGGCTTCCCAAATCACCCAGAT^ 

GGCTTNCTTGCTGACTGTTGGCTGTT^ 

GGCTTTTCCXACAAAGCAGCriTCCTIK^ 

GTCAGAAGTCCrcCACACATlCAAAGGGAGGACAC^^ 

ACCITACAGTCTATCAGCCACAACTTGAAGTATCT^^ 

CTGAATCCTATCTAAAGTTGTCGGTGGTTGCTTC^^ 

ATCTGCCTTCAAAGACTACTACCraraSTTGT^ 

GTATATGTCAACCACATTTTTAAAAATGCATTTCrre^ 

AGACAAAGACATATGATCTGACTAACAATGTGGAATOTGA^ 

GTQGITCCTGGGGCTAGGGGGTCAGTCAAGTGGGGAGGTGT^^ 

TTCTCGAGATCTAATGTACAGCATGAGTGGTGATGGAT^^ 

TAGTAATCAAATCATTACTTTATAGACXXTGAATATATTCAATAT^ 

AAATATATATATATATATATATATATAAATATATATATATATATATATATATATATATATATATATATAT^ 

ATATATATAGATATATATAGATATATGTATATATACATATATATATCT^ 

AATCCTCCTTCAGATCGAATGAGAAATCATCTCT^^ 

CTGACrrriTATTTTTTTTCAAAGTIOTTAAGACTAl^ 

GAACXnATTTGTTTAGAGGATACrrAGCCTGAAAtAGC^^ 

citacacctacxxatragtctataaaacixxtgagggatgctc^^ 
ggcacaaatcgtttgatttcacccxatcc^^ 

tgaaagcctagcctgaataatggatgacacxxricctaaagcc^ 
aatttcattaaatcacaactctatatacttitaaccc^ 
agatggaaaatgcctagataaatactgcatgggacattittaaacatgta;^^ 
tttacctctttatataaaacatocratattaaaagagagattaaat^ 

AlTl UU 'i' i CATTTTTCAGAQCATGTACTGGATTATCATTTAT^ 

TTTCTTGAAGCACAATATAATATTGAATOSGGAATGTT^^ 

TGTATGAATITTTAAAAGAAACITCAATATGCrAlX^ 

TNAAATAGGCTKXXXSCGCAfiGTGGTGTATG^ 

TCACAAAATXnCCAG CnnUXJ ' in'l C iU ' lT ATICAAAAAAAATRCT^ 
ATCTTAATCCTITC^ 

GAACATTCTICICITCG^ 

TIXX3CTAAACnCA(m>ACATAACOA<XCCAG^ 

AATGAATATTTCCnTCACa^GACTTI^^ 

CCTTGrrcAGATrnTAGOm^^ 

AAGn^roCAGCCCT^ 

TACATlXO^TCATTTTOCAanT^^ 
CACICTAlTrcACCACAlXnTOCTAGCACST^^ 
GACATGACirrcAGirrAGCACATOTACAGATCACCaT^^ 
TTItaCAGAATATATTCAAGOSTGGm^^ 
CACT^^AO^ACAAGAACAT^ 

ANCA^ANCACTMOflCA 
CACTGGATTACnCTCTTtXrriTCAATr^^ 

TATATATATATATATATATATAl i 1 U 1 U 1 1 1 i i 1 14TmTm X»ima(?ICTCACCC^SS^CT 
GCAGTGGCGCAATCTCGOnCACTCXAAGCT^^ 
GGO^ATACSGCOXXACCACCATGCCn^GGCTAAlT^^ 
TajTCTCAATCrcCTGO^^ 

CTOGOGAAAAlTATATArrATACACACAAAO^^^ 
GTrATGCAGTGTTreAAATAAATTrATI«X:AGCTAT<^^ 

OTATTANCCTCTAATOrnmCTACXTTOnrc^ 

CCACQ^CCANCAAAAGANGGCAC?rAACATTmCTArrACArrATACAT^^ 

^AGITATTCTGGGTGTTAAAAAIO^AATCAT^^ 

AGTAGAGTCnX3ACAGCATATCnCCa«:a3TCnX» 
GCGAAGAGATGTCnX^AGCCAAATTGAAGGCTA^ 

^CGAGAGA{^GGATtXTAGCn>GAGCn«nxn^ 

CCAGGAAACKXriCTACTTC^ 

T^^CCGACGGAGAAGTGGGA 

<^A(Xn«AGTrGTtaGAAATGAGGA(aTAATATC^ 
ATCGACTT^^ 

CAATXKrKy^GCACCAATTAC^GATTGATOXCrci^ 
AAAAACAAOaCTCC^^ 

TCAGCTCCAC^TACT^ 

ATTOAATAATGNCO^TCTATGATTKXSCCTTCAl^^ 
CXnTCTAArnTCTITO<?rA£JIX?^^ 

"TAATCATATCSAAAATCAGCO.TCAarTTATTAGA.^^ 
TCTGTTC^AATCATOmxn-ATAT^ 
TCTGMCTT^AAAa^^ 
GGGTCTTCAGC^^ 

GTCACTCCCAACG^ 
GCTOCOCATAG^ 

ATXnxnTGTAACACCAGGTTCCCTAC^ 



AGTTGATTTCATAlCCITTrATTTCATAAAGTTT^^ 

CTCGAGATGACCTTAAAGTATCriXyorCXXAAAT^ 

AACTTGGGGCATCAAAACATTGTTTTAAATATrrcGTTICT 

AGGAACGCTGGGTGCAGAAGCATCTTrcGTTTCra 

AAATGTACAGTTATATOITCAGCACAAGTGAATTATI^ 

ATGGCCAAAGTCTIXnCAGGTGAAGGTAAAAATCAAGTAAGATC^ 

GCTTGTCATXrTCAGGTTACIXOTXS^ 

TATTTCTTACAAAGTGGGAGTATCAATTTITCT^ 

TCACGTTGATTGGCTTGCCTACATICl^^ 

AGACGTTATOIXXIAATCCTTGTtnTTCT 

GGaTGATTGGTCCAACCITAGCCCGATACCTCT^ 

CAGTAAACTTGTtrrATQCATAGGGTTGmTAGAAAAGTC^^ 

TCAGGGGTCXriTATTXX:ACCCITTGAT^ 

CTAGTGATGTACCGGGAACCrcCTATGGGAACCAGAGCT^^ 

TCTKTITOGGGGCl^TCACAGTCTCATGAC^ 

GCAAGAGACCTAGG?ATTCACAGAGTGAAGATGATTrcT^ 

TTAGGAGAAGTGAATCriTrTGTTTTCGAAGCCT^ 

TTAAAGATGTCTGTTCCTGAAACTCAGGAGGCTGC^ 

AAGGAACAACTGATIXrrCTTGGCXXrrCXnT^ 

ATTCTGACAGTACTCAGATTTGCriTCATG^ 

GTATCTTCAAAGGAA^^ITTAGAATCATC^mrr^^ 

catcccaggtcataaaatgtaatgtgggattgtaggttt^^ 

cttgaaatgtcagggtccccx:attcatcttc^ 

caatgtcaagcxatcaggcagcgggttgggg^^ 

ctagagggt^tt^k5ag^taagtgcttttgacat^^ 

ctgaai^gagtcxractgcagatgtaattgagcct^^ 

cctfftttgggggaatattitaggcatcctcacaaaac^ 

aaacacttcaagaaggaaaatgctcttgcagaaat^^ 

CTllKrrTTTTGTGATCACAGTGTGT^ 

ATCCATGAGATCTAGGACCACGTCTGTCTTGATCATT^^ 

CATCACCATCACCATAGCTAGTATTTATTGCCATATACAGTrATC 

(XCAACCATXX^GTCATGAGTCAAAGATCSGTCAAAT^ 

ATCCATAACCCCAAGATGTGACTTTCCAGQCTTGGGAT^ 

GAGTAAAACTCAAACAGTAATTCCCATGTGCTCAGTTT^^ 

TAACGGGAGCCCATTATAAGTGAACTGAGGTGTTATTTTAATATATAC^^ 

AGTGAAAGAAGACAGACTGTTACCAGAGGACATATTCTGTT^ 

ATTTICCATGTX^AGAGAAGTCTy^AATAGC^ 

GGTATGCIGGAAATTTTCCATTCATCTGGIGATG^ 

AATATTTGTGCACGTCGGTGCGTGTAAGCAATATCTa^^ 

CCCGCCTCGGTCAAAGCACACAGTGGGTTTTCCX:^^ 

CGAGATGACCGCTTGACCTAATTCCCTATTCAGCTC^ 

ATCNTTTTACTOCAAAANCCTGAACaT^AAAAA^^ 

AGATGTACAAGGCTGmX3GTTCTGACCACCrCCX3GGCTATT^^ 

ATGTAATCnXIAAACTCAGGCTGAATGrrTGATCT 

GGTAGGATCTTAGGTCGAATGGAGACCATCACCXrrGGCT^^ 

TTTGGGGATCGGCCACTTACTTCnXXSGGTrGACTCTO 

CCCCAACTCAACTITCACTGGCCAAAGTAC^^ 

GTGGlTTTCTCXrrcGCITXXACXri^ 

TCGCAGCACAATCCGTTTGTTGTTGCTGT^ 

TTTTTGTTTTTTTGCCnT^TTTGT^ 

GGGGGCAGATTCCCGTATCATGGTGCTOCXnTGGC^^ 

TArrGTCTOTTTCTCTCTTCCAAGC^^ 

TAATCCCAGCTIGTTTTGGAGGCCGAGACICGGGG^ 

AGATCnCATCTCrrACCAAAAAACAACAACAACAAACAAAC^^ 

TAATCOCAACACTTTGGGAGGCTGAGGTGGGAGGATTG^ 

GACCClXnTTCTTAAGAAAAAAGAAACATTTAATTAAAAATl^^ 

TGCATTTCAGCXTGGGTGACAGAGCAAGATTCT^^ 

ACTCTAGGTAACCTGAGGCTAGACAAACCTCTCC^ 

TCTXXrrcCTTTAGTTTTGTTTTTCT^ 

atgctgttttactcattgtctaatlgaattcr^^ 

atogtgactcagattaagaaacttgctcaatgt^^ 

gcctgttgaatxnotaggattltgctaaatct^^ 

cxattaotxttttgacaiagaggaattttgattc^^ 

ggtctataatttcagactcaggtttgtttaagtc^^ 

tatccccaatggaaaaatcccitctcccacttt^^ 

aaacctotactaattcatgaggcaatcotatttgat;^^ 

GCCTCACCXrrXTITOCCXilAGGCTGGAGTGC^^ 

TATCCTCTTACCICAGCCTCTTCAGriX^^ 

TTTTGTTTITGTAGAAACAGAGTCTCTCTATG^^ 



CCTCCXyiCAGTATTGGGATrACAGCnxnXS^^ 
TCAGAGAAAGCAACCAAACXXSAGCATAACTCrr^ 
TATCATCATCCCATTCATAATTXSATTCCAAA^ 
AGCGATCACTTGGTCnxXrrcAGATTC^^ 
CACiroriTTOGTGTTQGATC^ 
GGAGGTGCITGATGGCTTXSACTCTGGCC^^ 
CK?rCITGATCIXriTATAT^ 

CTTGCTGTCAGGAAACATTAAGTTTCCTAAGTGA^ 
CCTCCCCATTOSAAATGATGGCAGTGGITACAAGGG^ 
AATCnXSACTTCXrAATTCAGTT^ 

GAAGTTTCCCCTCGGTQGCTCX^ 
CAGGTTCTTGCAGTTGGCAAACAGAAGGAGACC^ 
'"T'AAmTCCCAGCCTGATrTATTGAGGCT^^ 

TTTTGTAGTGGCCTCACAAGTTTCT^ 

GCCTGKSAGGAa=AGTTAGTCTATTITTTGTA» 

TTCTTGTrcAGTOTCTIT^^ 

ATTGACGTAGCATCACITTATTGAGCAGCTACTATCT 

CAGAGTGGTCTCnXnXXSAGAGGTCXIAGGTIT^ 

^^^^^^^GAGGAGGAGGAGGAACTTCICAO 

CCACTCWrcCAAAGCCTCriTGC^^ 

CXnTATTAAGAC^ 

CT^GTTTTGCX:AGGCnTCACn^^ 
2^ AGTTTCTTATTCXXSGTAAGTXXT^^ 

GCACAAATGAAGAATATTTAGTCCATTCTCn'ATrACnc^^ 

ATCTAGTTCn>GATTTTATTGAC^GTCATT^ 

TTTCACTAGCTTTATCnXSGCrcGCA 

AAGAGTCGACCI^X3G^O^GAACCKX^ 
JU CAGTAGAGAAGTGGTTGCTTAAGCAAGTAGTTTCCGCT 

GATIXXIATCTAAAAATAAAGACTlxn^GAAGGTAAAACTN^ 

TTCTCATTTAATAGGTTGGTCATCTC^^ 

TTAACTATGATGCTACTIXTrTAGCTGCAAGGGGGGAAA^ 

GGCATTTGTiaSAATrTCCCAAGAACAT^^ 
J3 CAACAAATTTTCnXOTXXriTOT 

CTATCCITAAGGAGAGAAACACATATAGGATAGGAAAATAGOTXX^ 
'TC'TCTCTCTCTCmcraCT^^ 

GGCIXSGAAAATTCAAGATCAAGATGCAGACAGGC^^ 
ACGTTGCTGTGTCCTTATTTGGTO^^ 
4U AAGACAGATCCAAGATGGTGGCNAATGGTCAAGAAQQGTGQ^^ 
GCTCACTGAOTAAATAAATAGGATITOKXXT^G^ 
CATrcAAAACACAAAGTGTATTTCTTGATACT^^ 
TWGTGTGTTAANGGCNAGATGGAAAaxn^^ 
^^<^^2^GTAAGTAm-IWGAGNAAGGCANCAT^^ 
CAGTGOIAGGTCTAATCXXXrnCATTTG^ 
CKXXSATGGTTCCANG^GAGTTGTTA^ 
GCATTTCTAAATCrrGAGTXTCTGC^^ 
COTACACGTTACAAGGCGCTCAACTCACGGTGGC^^ 
GTCAGCTQGTG^AGGTG^ 
:>U CCXarcCTTTCTCXXTTTTTCAAGATC^^ 
GCACCTIXriCTTCCTNATATCniACCT^^ 
TCTCCTTGACCAGAACTCnTVTCAAAT^^ 
ATGGNAAGAGTAACAGCXXrAGTACACAACTCnCTCC^^ 
ATCCCCTCTTCCCTCICAT^ 
DO ^^^TAACAATGTTAACATCAGGCT^TGGGTTTT^ 
TTGCTTCyVGlTCTCTGAGTCrcAGCTG^ 
AGAAAAGACTGAGATAANCTAAAGTACCTGATTGtf«^^ 
NrraGGTCXXXnOGGXHTAOTCTAGTC^^ 
'^^^^^^^^^^^^^'^'^^^^^^^^^^ 

ATGCGCTCAAGCGACCATAACCACCAGTGTTCCTAAC^ 
TTATimTCXXTOAGGTAAGCATCGATTTTAOCTT^^ 

ANTTCAcaroyu^Gcra^ 

liiv,i-i-ivx-iuu-iACTC7n?rCACTT^ 

GTGGAATTTOXXrAAGGCTAGCATGATAACAA^ 

TCAGAAGTTCCTGTATTAAATTACAAO^TAAGAAC^^ 

TCCATAGGACGCAGAGTTATOX-rcAACSCaXr^^ 

ATCAGG^GCAAANGAG^^ 

AGQCCTCGCATATTTreGCAGGCCT^ 



GACCC7^CATC®m3GTAAATTCTIXXACATAAA^ 
NCTOX3mCCGTO:ATC(3^TGATCCGGG3^^ 
ATAACrcxXNTGCCCAGAATtrcACCAGGC^^ 
GAGCTTTCOTAAGAAGCXXXIACCa^CACTACT^^ 
. GCCAGGGAGGCTGGAAAGTACAGCCTTCAGTCCTAGCA 
NGAAATGACAACGAATAITTGQGANOCCACTGGGAGT^^ 
CTCCCCIGCa^AGimSTATCCTGAT^^ 

attatatoxxxiagactccnxtitaaactcctg^ 

aggcitgaaccaccatccctggccitgatac^^ 

ccgtcacccttctactatcccatttk^^ 

tcctaccagccagaggtgggcacatgacxx:aggtcagacca^ 

ggaatgagcatatatcaaaccaggacagtccaagtccri^^ 

CATAACTTGACCACACACAGCTAAGCTOGTGATCC^^ 

cctgttgtagttaaaaaaaaaaagagagagagagagagagactgatac^^ 

aancaaagatagagctagcatgattcx:aagctgaggacatacg^ 

gctgtagacaaacaaaaatgatcttrcggto 

tgtcccactatcattatatccaacitgtgtaato 

atatcgtgaqqcxsgagatgtgagatgtgatcacgtttata^ 

CITCTCAACrCXrTCCTCCTGGCT 

TTTITAACXXrCATATACGAAGCAAACCIATXmn'ATGT^^ 

AAAGTCGATTCCTGACATITCCGTAGACTGTTTT^ 

TATTCTTGCATTTGCTGCAGCAAAACC^^ 

CCCCCCXXlJCCCACCAAATrcTCCCGT^ 

CTAGGTACAAATCCCACATGAAAAGTTKrrcAATCA^ 

TTCA(XTCCTTTCCTX3GTTGTTTCC^^ 

ACCTC7«3GGTTTGTCATGGACCACACCTIT^ 

TTTAAGTTTTTATTTAmTTTANTTTAGIK^ 

CAATTTCATTTTCATAGTlTCTCTGrm^^ 

AATTATAATTAAGAGAACCAGAAGGGCCTGTCTCTCAGGAAAGC^ 

TGAAGACTOKX:GGAAAAGGGACATGGCCC^[AGGGAACC^ 

ACCCCAGGGTGGAAGTTGCXIATGCGCATGAGGCI^^ 

AGCAACCICTGAGGATQGGTAGACATATACITCATTGC^ 

TCCTGCCAGCAGGTTTAGCAGTCG^GGGAC^^^AGGGAAGT« 

TCCTTTITCCTGAATATGGAAOCAAAGCCTGTXnT^^ 

CTGACAGCCACAAGACrClGCCTCCCTGC^^ 

CATGTCTTTGCAACTAACATACTCACTCAT^^ 

TccTCAATcraxxxrrccccAct^ 

CCCCCCAGCTGACCAGTGTGGATGTCTTCAT^ 

GGCAGAAACCKSCACTCAGTACATCTATACATGGTGTAATGAT^^ 

TITATTATTTGTGATGAGAAGATTCTUVAACC^^ 

TCACCTTTCTGTGCAACAGGGCACCAGAATTTATT^^ 

AAACXTOAACTCAGGAGAGTGGAAAAGAACTTCAACATIC^^ 

AATGGTAGGNAAGATGAGAAGTTTTTTACTCTTTGTT^^ 

ACTATATATGTATTATACATGGTAGGACTTATTTTTTTAATC^^ 

AGTGATATCAACAAAACAAAACATGATATCCAAAGTCACCCT^ 

TOGTTCATTTGTTTCATGCAATAGAAACGriTT^ 

TGCIXnCAAGGTCTTGGCXXriTCAAGAGCAT^ 

GGTGGGATGACACAGGCTXXriCT^SGC^^ 

AGGTGACT1CCAAGATTGAGGTK3GCGAGTTX:G^ 

TGTGCAAGGACCAGCCAGCrcAGGTAGAGGAAA^ 

ACACGGACTGT<7rcAlTGGAAGCCAGC3^GGG 

GCCCTTTCATTAAGTGAACGAAGGGTGCATGGTGAOT 

GCTT^XXXTOXATCTTTCTACWrc^^ 

antcagccagctctccaaagcaagccc^ 

ttctaqctatqcagatgcagtttaaagacaaatcctt^^ 

cccctaccx:tcttctgcgtagaagcgct^ 

GACCTATrXXnTTGTTTATTTTAGCCGGri^^ 

AGTCCCCAGGAGClCTTGGGAAGTCa^^^ 

GTCAATXrAATTAAACXSGATACnATTAGCACCTACT^^ 

TCATTTGGATTCTACGGTCGCICTTTAC^^ 

GTTCXrACCATTTlCrcAGCAATCT^^ 

TGCTTTCAGAAACAATACTGGTGACCTGTTTTAAGAAAAGC^ 

TTAACCTCATTAAGriTTAAAATTGTITrTl^^ 

GCCTCAGTGATCTTCTATAACATAAGTlXSATAGrrc^ 

GTATATATATATATATATAAAATATATITCCATGAATATACTGAATGCTAAN^^ 

TTAAATTCAGAAANTTTCTAGTTGTGTTTATTT^^ 

TGTAAAANCTACTGAGTTNATGTTTGGGTCAT^^ 

CAGCCTCKXriTATTAGAATAaGATAAAGTATTC^^ 

TTAAGCATGAAGATTTTAGTCAATGTGCTTTCAAAC^^ 



NAATCTATAATAGGAGGAACTGATTTTCAAAaVTCATATI^^ 

TCnxn-AAGACTOICIXnCACGTGGGAGGATT^^ 
AGAGGGGAATGCAAATTCAATTTATAAGCTATTATTAA^^ 
rrCATITKXriCTCATACAATTCAGCAAATAGC^ 
ACCAGO^GGCTCTCAGGrcanX^ 
CCAACGTTITCAAAAGAAATCTCCTITC^ 
ffGA^TAATMGO^^ 
^GGCmn^TGC^^ 

TCAAGCTGCTCTCAGCTAAACTCCAGGGGTI^^ 
ATOTAAATCACCTCCAAGAATGGGlXrreCGI^^ 

T^CTcroreAACcrc^ 

TCGGCTreAGACTCTGCAAAGTGCArr^ 
CTGTOGC^ 

NCCXX^ArmSGATOGGCCCTOGGIOlATGTGTAGA 
TOXCCACTTAATIOTICCTGACACAGmG^ 
(HClCAGAGCGGCCrCTGAAGAGACAGGTIGGACCAAa^ 

AA^CTCTGGCaiATTGGGTGTOGTGTCC^ 

A^AGGACKyU^GATGCOGTGGGGGTGTAOTAGTC^^ 

Ae^*?*='^'^'«^ 

ATCCGATCmaSTGAGACTIXnTCACTATCACGAGT^ 
CGGGGCCO^^ 

AGATGACGGAAACAGCCCCCCAGCTCCAGOCATITC^ 

AGATGAGGGCAGGATACCACT^^ 
Trcra^CTGCACAGOTACCXri^^ 
ACTCCCCTOGAACGACAGCAAGW^ 

GGGCATCTGTAGCITTGGAGGTACCT^^ 

AAAGCATTGCCTGCACCGTO^^ 

C3TCAGGCXXriTCAGGACTCAGGGTGTATAAGTtav(^'^^ 
ACTGCOGAGCTGGCT^ 
GTGCACTGl^^ 
AAAATOOaAGfiC^^ 

cacgc7™aacta^ 

CNAGTAOTITCCCATrcGTATITrATTCTCXr^^ 

Ara[GACAGt«X:AGACTCGA^ 

"I^SCCTCAGOCTOCTGAGTAGCTGGGNACTAAA^ 

TT^TCTOTCIGATCATT^ 

OCTTCTTAGATICAACAAAATOrW^^ 

GGQCCTGTOCTCICAOavG^ 
MCAGGCAAACTACATGCATTTTAAAAAA 

AAATAAAAAATrAGCTOGGC»rocn«X»GCCAOC^ 



AACCXrAGGAGGAGGAGGrTGCAGTCAACTAAGATTGrirXC^^ 
AATAAGTAAATAAATATAAAATAAAATAAAAAATATCTCXriCTAATCAM 
TTGGAACAAGTGGGGTO3GGACAGACAAGTGAC^^ 
AGAAGCTGGCTGACATAGTCXXriCACACCTC?r^ 
GAGTTCAAGATCAGCCTGGGTAACATAGGGAAAC(XTT^ 
CACACCTATAGTOXIAGCTACrAAGGAGGCTGAGGCGGTAGGAT^^ 
TGATCACGCCACTGAAITCTAGCCTGGGTGACCGAGA 
TTTIXSAAGACACAAAlTCAfiATGCAlXSTGGATG^ 
- CACCCTTCCTTATTTTCTCTACGACACC^ 
TCATGAATGCTTTAGTGTGGCAGATGTTACTGTACATQGGa^^ 
AAGCACGGlXXTCAGTGATCXriXTrGAGGGTGAGCAAAA 
AAACTTTCACCATTCTTTTCSTAAAAATTAGAAAAATAC^^ 
AGTATTTCTITrAGGATTCTATTATTTACCCm^ 
ATTCCTTCCITCCTTCCCCCCT^^ 

CTCCxrrccxrrcrcTGTCTTCTix:^^ 

CTTCCCTTTTTCCCTCCCTCTC^^ 

CCCTCAGTTGGCAAACTTAAAAGTTCm'AAGTCTGTC^ 

CTAGGAATGCCTCATGTATCTCTATCCTGTGAGTT^ 

TTCATTGTTAGCXSTTGTCTCriXXAGCCT^ 

TTATATAATCAATATTTAGTGATTTCyvCTTCTTAAG^ 

TCATGTCAAGGGAQGGAACXnxIXXriCTTCTACT^^ 

TCCCTTGGCXrCTIXHCAAAAGGGTAGCCAGlT^ 

TAAAAAAAATCAACCAATGCCTGCXXriTTCTA 

CTCAGCTATCCAGAATTACAACTCCGC^ 

ATGJU^TAGCXAGTTATCTTTGCXXrn^ 

TTAAATTCAACCnTTCCCClH3GTTAAGTCX5AGT^ 

ATTAATATXXmGTTCAAATACCTCTTGCATGGAGCAAGAAGCT^ 

TTGAATCGGCAGCTTTGGCTGGACTAGTACTGCAGCTTC^ 

ATTCriKXn'ACGAGGGATXSCCCTCCCAAATTAAAGA^ 

TCXIACAACCAATCATCATGGTTTCGTTAACA 

TAAAAACACATACTTCTTTTOUVCAAAAACIXSGCT^ 

TTAGCACATTTAAAATTTTACMCATTTTAAAATTC^^ 

TACAAAGGAAACGGTTACTrTGGATTTATTCTTT^ 

AAGAGAAGAAAGAAAGAAAAA7\AAAAGGGTXX:rrTGATQCCXnt3C^ 

TGAGCCXCTCTGAACAAAAGCAAAICCTGTTGACC^ 

AGGAGAAAAGAATGGTGCGAAAATCAGTGACACGAAGTTGAATAT^^ 

TTAGGACGACTGAAAGGGCATTTCCCCAGAATTGCAGITTT^^ 

TTCAGAAAGCCACAGGCTXnTXrrcAAAACAGGC^ 

gaatgtiv^ctgagtacctgaacacttgagtgggcagggattc?^^ 

gttcacatcgctgattaatralxmxiaatgatcati^^ 

tctxxxxxracaacctggtgaactgaatagagttccc^ 

agttgcx:agaaatggaatgattcagattggatgaaaagttg^ 

aaacacaataaatacacagttaataaaatacttcatattttattataaaaa^ 

TTTICAAATACAATATTTAATAAACTATATTAAAAATATATAC^^ 

AAATGAAATACGCTTACATCATTTACTTCTCTTC^ 

TCAGCTTATCTTTGCATATATTTCTIXXAAl^ 

ATlTTmTTGGTGCTAAGGGCTOTAATTTAGCAAAGTC^ 

TGTCTTAGACCCGGa.GAAGTCTCX:AAAGAAT^ 

TCITITrrTTTTTTTTTTO^^ 

GTGTGTTTGTTTTGGTCAAGGQCAGAGTCnG^^ 

AGCGTCnCTTTTGGTGCCACGGAAAAGGGCTGGATTT^ 

GCTIXnCGGGGACCTCX:ACACATXnTT^ 

TAAAATCTATCAAAGTATAATTAAATACCTACAAAACAT^^ 

ATTAGATTTCAGTGATATTGAATXnxXXSAGA 

CCTGAGAGTOXTTGAAACCXrAGAGAGGTCGTACGT^ . 

ACXrATGCATGAGACAGAACAAAGAAAAGAACTGTCAAAATACACT^^ 

CTTAAAAGGCCAGTTTCTCTACTCXAGAATAATGA^ 

AACCAGATTGCAAGTCTGTTTGAATCCCCCAAATGCC^ 

CAGAGTGGAAGTTGCXrrTATAAAGCTGCGACTTGGC^^ 

TGTTGGGAGGTTCAAGAGCCACTCTGTGCAG^ 

CCCATACTXnXX3AAGC3GTCAAGTGCAGC^^ 

AGGGAACAGGCATTACTGACATCACGGGT^CCTT^^ 

AAAGCCXSGGGTCGCCTGCOCTOCAGGGGTTCC^ 

AAATGACAAAGGCnCAGGTCGTCTTAAATTCCAAATC^^ 

GAAACGATTTAACTAGCAGCAATAAAATCTCTACTCr^^ 

TCTrcATATTTGGGACCCTTTTTTi^^ 

TACCTTTCTCnCCCACCAGATTTATTTTTA^ 

ATCCnTTCTCTAAAATATATTTGGTGAAAACTAGGAA 

CXyVTCTTGAAAGTAAAACTCXXZAAGTTA'iU'ilTri^^ 
CAAGGGGAGGCTTAGCACTGATOTATm 
GGAGAGTGGW^AATC^^ 
AAAAAAAATGTTITCTanCAAGTCACATG^ 
CCAGACTCGrrcATACCCCTCXrrAGGAGAC^^ 
!) 'I^CTCTTAAAATGCAGCCK?!^^ 

AAATATAGirrcGTTTGGAAAACAAAGGGGT^ 
TTIXXAAAGCCATCTOTAATCO^^ 
TTTAGTGATTAATTACCAGGTTTCTCTATTAA^ 
<^CCCACCCCCACCCCOCTTTAAGTAAC^ 
lU TCmX^TTCXTTATCntXXn^^ 
TTTCAAGGTCnCCTCACTTOGTG^ 
GCGTTAATTCAGTITGTGATAAAGATCX^ 
GTTAGATCHGGCATIT'AAriTTTCCACAAAAa^^ 
GTAAGAATATTGAAGGTQGTATACTAAAGGAGATAACTATTTC^^ 
GAGGAGCTCAATTCTTATTTAAAAGATAAAAGGTC^^ 
GGAAGCTGTCTTACCTGGACATATTAATTTTAAAT^^ 
GTOTXSAGTTCTCAAAGTCTG^ 
TTTTATATTAGGTCTGAGATGTCGAAAACAGAAAT^ 
TCXXTTCTGATATTATAAAAICTOGAGGI^^ 
TATTTACAAATCTGGCATOAAGTAGAACAACG^^ 

GGACACnX^GAACGAACAGAAlTTCAGAGACATCnTI^ 
<=AAGTAGGAAGCAGGCAGAGTTGCAATATCAAGAAC^ 
ATATCOCAGTGTCntmxyiCACCAAAG^ 
GAAATAGGTAGlTGAAATTTOXXXri^^ 
Z:) AAAGACy^TCrKXTTTCTCATATGOT 

AATTAGATAATACAAAAATAACAAMCTATAGGAGGrrcAGGATGGG^ 
GAATAAATTCnCTATCGTAAGGGAATTCTlTGGT^ 
CCCTGGAC^CTCCCTGCCCTCCT^^ 
<=^AAGCAAGACCAAGAAGCTTAGTCTCAATTTCAGAG^ 
GOTrTAGCCAGCCTCTGCATGATTT^^ 
TCACTTATGGTGCATGTCGCTGATaSTTGGC^^ 
TTGTCAAGGAAAGGAAAAAGAACAGTCGTGAGACCAAAGGAIXOT 
Crax:ATTCC:^TTGGCTGAAGCAAAT^ 
CAGCCATACAGCmVGCXrGACAGATGGGCnCAAGAAGATTC^ 
TTGGATCXrrGGTXnGCACCTGGCATATC^^ 
TTACCTACICCTITOSGCCAGAATTT^^ 
TTCTAACAAAGAGGTGTAAATGGACAAGAATGAAAAGCAA 
ACTTGGATGTTGATITCrrAAAGGAGGTGA^ 
ATGAAGGAGGGCCCATCTCTTACGGCAGGKIAGC^^ 
GCAGGTGCGGGTGGCTCTTGGCACATGATGrrAa 
AGGCCATAGTGCAGTGTACAATCAGATTTCATTGCAGC^ 

caagtaggactacagttc?kx:actaccacacc^^ 

CXXrAQGCTGGTCTCAAACTCCrc^^ 
AGCCACCCCACCCAGCCCATCnTAATTTTT^^ 
4:) GATTAGAACCP^CCTCTCTAAGGCATTCAGC^ 
OCTtriXnCATTTTTTTCC^ 
COTCnCAGAACACXrACTCJ^ACTT^^ 
CTCATCSGTTTCTGC^^ 
CCTCCATGOCITrKaiTGCTG^ 
AACATCTCTACCITCTOCGGCX^ 
TGAAGTCCCGATCXXma^TTTGri^ 
T^^*=A^^GGC(XXriXXrrGAAA^ 
CAGTlTCCTCTCaXCTTACA^ 
ACCaTAGTCGGTCTTCTCCACTGC^^ 
AACTATTGCTAGAGATGGCTGCAGAATACCTX^^ 
CGAACTTTCCAGCCX5 AGAAT QCTT^^ 
TlTCAGAAaTCAGAAGTTTTCAAATT^^ 
AGCCCATAGGCCAAATCTOGC^^ 
^^'I^SGCTGCITTCAGCTACGT^^ 

AATATCTCACATCAGAATCAATGAATATTC^^ 
CAAATQCTITATTGAAAGAACTATCTGrK^^ 

AAATAGATGCACAAAGCAGAGGGAATXIAAGTAGAATAACATGGrm 
^^TGCTTATCATATCrrrCAAT^ 
^'^^^^^'I'^^^^CTQGGGTGAGGGC^ 
TCACTICreAAGereACCTT^^ 

TITCAGCTAGAGAGACATAGGTGTATGQCAGATGAGGGAC^^ 

TTGATACTAAOS Cg^ 

TTGACAGAGTTOIViuwiGlTIT^^ 



CranTACATAGGTATACATGTGCCATGGTGGrrTGC^^ 

tittccatagctcittgtgatgcttgc^^ 

cagctcccioxtaatcrtgctagaaggtagggcraat^^ 

acactctagtgaaatictatgtgtaaaataatttgtacaagt^ 

tatittttagaggttgtaagctaagtaataacattctagaa;^^ 

cgctaatgggagatcctaagtttgaggctctgctac^ 

ccgcgactcggtaccix3g1ctatagttogctaaaaagaag^ 

acacgtagggaggctgaggcaggaggatctcctgaagccc^ 

tctactataaatttaaaaattaactgggtgtggtggtgcacatc^^ 

ggcaagagaatcccctcagcccaggagttggaggtt^ 

agcaaga(xccatitcx2aaaaaataaatgaaataaataaatacataataaat^ 

gaagaaatgctgttttaggaaatcaataggataattttri^^ 

gagtgcagtgqcacagtctcggctcactgtaagc^^ 

agttgggactacaggcgcccgccaccacactcagctaaittt^^ 

caggatggtctcgatcicctgacctcgtgatc^^ 

gtgcctggccaggataatcttraagaagaagataacattagctat^^ 

gtctttggcctggaaagggtattggtagcaattgtcto 

gcacacagacacttttgtogttitttit^ 

acaaaatgccatggactgaataatatataaacaacggaaatgtat^^ 

aaggcactggcagattcagtgtctggtgaggatttg^ 

agaggcaactctctggggtatcttttaaaagggcact;^ 

GGGAGGCCGAGGCAGGTGTCrrcGGTGGATCACT^ 
TCTCTACTAAAACTAAAAACrrTAGCCAGGCTCXSGT^^ 
TGCTTGAACCTGGGAGGCAGTGGTTGCAGTGAGCT^ 
. CTCCATCAAGACTAATAATAATAAAGTAAATAAAAGGGCACTAACCX?^^ 
ACCTCCX7^GTCCXX:ACTT(XTAATGTTAC^ 
AGACCACAATAATOITTTATAATGTTCATTTTGCT^ 
ACAAAACCCAGCGTCGTGATTTTGGCAGCCTTGCCATGCATC^^ 
CTGTTCAACCCTCCACATGAACTGGTCAGTTrA^^ 
CCAGCACrTrx3GGAGGCTGAGGCAGGTGGATCAC^^ 
AGAATAGAAAATATTAGGTGGGCGTGGTGGCGTGCACCTGTCGCCr^ 
TGAGCCCAGGAGGCAGAGGTTGCAGTGAGCCGACATOXACCAC^ 
TCCAAAAAAAAAAAAAAAAAGTGACCACACTGTGGCTTTGCTGC^^ 
ACAATGCTCAGTGCAGTGACTTIXXXXSAGCACAA 
TGAGGACGTGCTTC?IX3TGTC?rTCTGAGGAAAGCAGGT^ 
GTCCCXXTOCTCCICAAGCTTGCATGGCGCTC^ 
TTCTGCATGGCKnxrrTriCTGTG^ 

GAACTCCGTTAATGTTTATTTGAATGATGGAATGAAGGTGT^^ 
TCTTTCCXrrAGCATCITTTCATGT^^ 

TATATTATCATTCATATXy\ATAATGTTAACAATTAGTGTTTAT^^ 
TATTCATGTGTTCTTGAATTAAATTrTCATGGC^ 
AGGCTTGGAGGCAGGAAATGACTCACTGAAGGCCAC^ 
TCXTTTGCTCTCXAACeiCTCT^^ 

TGGGCKXriGTAGAAATACXIACAGGTCAGGTGGCTrAAAC^ 

CATAGATCAAGGTGTCAGCAGGGTTGGTITCTIX:^^ 

ACATGGlCTTCXXTCrcTGTGCGTT^ 

CCATATAATGACTTTATT5TACCTGAACTGTCTC 

GGTTTGGGCTlXX:ACATAeAAAGTCIXXX^^ 

cx:gggggaataaagctagagttgctttaatccttgtaatatgt 
acctgagaoctgagaggaggagaaggaagcttggagt^ 

CCCAGCTGACAGGCTATACGCGGACACCTTGGATGTG^ 
AAAGGTQCAACTCCXACTOCTGCTGCAAGAACAAAGAC^^ 
CATXTAAGTGTTAGGCCAGGTGGGGTGGCTCATGTCTC^ 
AGGTCGGGAATTTGAAACCAGCCTGGCXIAACGTGGCAAAACC^^ 

AGTNTATATAAATATATATATATATATATATATATATATATATATATATATGTAGCCATGCATCC^^ 

gtcccagctacttgggaggctgagacatgagaatcg^ 
tgcactccagcxtigggcaacagagcaagac^^ 

ACACACICCAGGTOTATATCrrCAACAAAr^^ 

aataattacaggacccixxx:aacctcctcot 

tgtctctaaagcxagcacttcgggagtccaaggtac^ 

tcatgagacxxxiatctgtatccaaataaaagcaacaaaa;^ 

aggaggctgaggcaggagaatcacttgagcaaaggaggttgaacx^ 

acagagtgagaacctgtctccaaaataaataaataaataaataa 

ctgagttcctgtagcatgtgcactactcctgarraacc^ 

tttictgccaaatggaagatcatcictaattaataa'^ 

gatggtgtacctagtagcraatggtgagtaggatgaataataatactataccl^^ 

gaactggcttaaccaagaagcagcaagctgttttaatk^^ 

GGCCXATGGC-mCTGCTAAQAACCTGCnXS^ 
GGTTTGACAACnTTCCTCCAAa^C^^ 
AGCXrCXSAAACTTGCnCOnXSGAT^^ 
TAGACACTAGATAGTC3GGTACX3C?rcGGACAGT^ 
TTTGAAAATTTCGCACAT^^ 
TTCAGGAAATATCXTITITITTIT^^ 
GTGAGCTCItXrKXXAAGTGT^^ 
AGTTTATlTCTTCriTrcATl^^ 
OTO^TrrACXXrAGATTGTGTTTCT^ 
^'^t^C^CAACTCAAAGGCTC^^ 
CTC5CTGCCAOTTOGTCCACAAm 
TTTAACTCACACCTCTGAAAATGCTC^ 
GAATTGTACTTAACTGGCATATCATTCIX^ 
TAACCCCCTACTCACO^CCX^CCCAGT^^ 
, ^ '™=ACCTCTCCCCTGGTAACT^ 
i:) 'ICTACTTAGATACACATTICCATTTTACTT^ 

TTQCTTCxi^xnTrAAccT^^ i i 1 1 1 1 1 i i 1 i iTiTi rrrm i 1 i 1 i 1 i-i-i Trmrri ^GAC 

AGAGTCraXriCTGTTCnCT 
CATKnCCTCCCTCAOXr^^ 
GTAAAGATGGGGTTTCACCATCnTAGCCAGGAT^^ 
GTOCTOSGATTACAGGCG^ 
TGAAATTCTCCTCXnTITI^^ 
TC3GCTACCCAffnx?ITAGCr^^ 
CAGATAGTTITAGTAAAACCTrcAAAAGATAl^ 
CIOTCXnX^CCAAGTTTCTAAGC^^ 

CGCAGTCGCTTCATCCTTffrAATfYT^ar^ 

Z:) CCXJ^GTCGCTTCATCXnT^ 

AGCCTGX3a:AACATGGTAAAACCCX:ACOTCT^ 
AGC^TACACAC^GGCTGAGACATAAGAATCGCTrcAA^ 

actcx:agcctgggtcacaagagtgaaactccatc^^ 

CAGACGGACCCTAAGACGGCCCCAGTGATACCCCT^CCTCCT^^ 

gaatgtaggtgggacatcx^gacttacttctgac^ 

TACACrATATGCX^CX:AAAGGGATTTTGGAAATACC^ 
GGGATCCTCXirrGGGTGGGCCTGAOT 
TTGCTGGCTITCAAAAAGTAAGCTGCCTT^ 
GCCAGCAAGAAAATGAGGACATTAGTCCTATAACCACAA 
TGATTCACCATCCCCAGTCAAGCCTC^ 
GGATGGAGAACCCAGCTC^CCATGCX^^^ 
TTTAAGCTGTITAACXrmTGGTAATI^^ 
ATCTlTCTTTCGCATi™^ 
'^ATAAGAAACCTGTACTCrnxXO:^ 
4U AGATCACCTCAGGTCATGAGITNCGAGACCAGCCT^^ 

acxx:aggcatggtggcgcatgctictagtcccagccga^ 

CAGC3GOTACAGTGAGTCXX3GTGGCAGCTGai^ 
GTCACTGCACCnXAGCC^^ 
TATTTTTCATTATTTTXXACTATGAAT^ 
4D ACACACTTCO^CCTACAGAGTCACATAr^ 
AGTCrTTGCTCTACAGGTCAAGTTGGGAOT 
TATAGATGAGGAAAGTAAeAACCCACAGCATCTTGCA^ 
TAACAGGraLTATGATTTCC^TTTGlTT^ 
TGAAACCTOmTTATTTATTTATT^ 
3U. TGGGTQCCCCAGCATTATCTCGTAAAGGACAT^^ 
TCACCAAACCCTAAATAATTGTCTGGATCn^^ 
TAGAGTQCATTAACTTCTTTATCACATTAATACI^^ 
TTGCXXriCACAGGACAATATCTTTAATAGCT^^ 
ATAGCATACTCAACCCATCTAAC^^ 
GGAAAAAACCCTCAOTTTATTGCAGO^GATl^^ 
TrcrcCAGCACCCTGGCTC^^ 
TCTOGCITGAATTTGTTIXX^ 
TGTTCACCTGGAGTCTGTACACAC^^ 
GGTTAATCTCCCCGCTCACCCT^ 
GCACTCGTTTGATTTGTAGCXXSCCACAC^ 
AACTTTGCTGGAGAACATCAa:AAlx::ACGC^^ 
'^^^^^^^^^^^^^^^^^^^ 

CrCACTAAAGGGGAGGAATCCCTGGAGCCT^^ 
TTCAC>2CCCACACAATCCACAACCT 
03 ATATITTAITITTTCAAACAGACTTCG^ 

TGTCIXnAACAGCCAGTTTATCCTATTAr^ 
CTCGAAAT^OTAGCT^^ 

ATAGTAAATGTGTTITTTGCTATTAAAAGTAAT^^ 

GAAAAAAAAAAAAAWAAHRRMAAAAAYTGCCAAAAGTAATTGGCAAAIX^^ 



GGATTACTTAAAAGTTTITAAGTAATGCAATTACT^^ 

taaaagtaatcgcagtttttg€x:attacitaaaa 

attgttattaactaaagtxx:atcgtttatcata 

tccaggagcccatccaggataccactittatrtattt^^ 

ggagtgcagtggoxxsatcttggctcact^^ 

caggtgggaccacaggcatgcgccaccacgctcagct 

ttagcatcttcttggctgtcacagttt^^ 

tatktictaggatgcccxrrgtgttgggatttt^^ 

gaagaccccagaggtctgrmxit^tcacatcgta 

ctgattcacctggtcaaqgtaglxmcgttaggttt^^ 

CCirrcGAGGAGTGTTGCICTGCCCAGC^^ 

CTGCATAAACTACTIGGAATTCTACATGGAATAT^^ 

CX3GTACAeTCATGGACM?TTATTTTATGCWrATG^ 

CCAGATGGAACAATTTGGTCTTCAAAAGTTCCT^ 

GTGTTTTAATCTTTGGTTTTGTTATCACT^^ 

ATACACAAAGGGCmGCATWGCTACrATAATTTTXrrGGAAT^^ 

CATACTAGAGCAGTAGGCAGTGTGGGTGAGTTCTCXXAC^^ 

ACnriTOXXXriXnGCCOCAGGTGTG^ 

tctggtcttaaagtc(Xaaagggaggcx:atcacaaa^ 

ccaaacitgggcgattcagtttracctgtch^ 

aaggcctcttccaatcttgacattctatgatc^^ 

TGAGAGrnCAGGTGTTXnXSTTGGAAATC^^ 

GACACATAAATAGGAAGACAAAGAATGAATCTTGGACTGGGGAGAGAAGA^ 

TGAIXX:AAATCCTGGGAAGACnCTC3GAACAAC^ 

ACAAAAGGTGGAGAGTAGAGTATGCAGTGATTTCAGGGTGTT^^ 

TCAACATCAAATGTCTGGATCAAGAGAGGCAAGAATCa^^ 

TCTGCCrrcAGTGTCACCTCTT^ 

tgaagagttcagtaaactgtctotaacccaaggaaagatacac^^ 

agataorggccotacagtggaatgaaccttcacaccgaggatc^^ 

tgcttggaattgtcatggctgcxrgtctgagtgat^^ 

tctccctcatgtcacattgagtggtgctaatgtttcctc^ 

gccttcxzagcagaatccccritctttgctct 

gggttgggggattgtgggtmxx:gagaatcagtatttctatci^ 

aagtacccacctaacagacttgccagctccrtaaggataagaactatau'luu 

aaggataattacacagaataagcyvctgagatgcctgritcatgg^ 

gtctgg?^cittgcaaaataaaca(x:atggtca^^ 

aattcagggttaatttcttgcrrtcaaatgaato 

CGACX^ACTAGCCAGGCGCTTCCCTTCCAGCTG^^ 

TGGTCCTAGGCCCCrTCCrcAriXTKXIATOT 

CAGAATTACTTTCCATTCTCAGGTCTTTTATATAAGTAT^^ 

GACCATAACTGCAGTGTITTATTGTGCTrACATATTTCAAAAT^^ 

GAAAACTTTTCAAAGGGCACCACTGCTCGC^ 

TCTGAAAACAGGAAGACTTTTTCTCTTGGGTATTT^ 

GAGGTTAGGNCCTTCCTITGTCTCCCAGGAGCACACT^ 

ATATATAGCAGTGTTCTCAGTAAGATTTCTCACTWSGCAT^^ 

NTTCAGGCAGAAGrrccaaAGTCATTTClTCT 

TTAATCXXrTTCACTGGCTCACCATATCC^^ 

CTCTGCAGCTTACTTTGTACCACTTTCOCCAGA^ 



TCACTTTCTCTGGGGTGCAGGGGTCTTTCATAT^ 

AAAAAACTAATTTAGAAATAGAAAATTAATTGAGTTTTAGITrTTAGAAAA^^ 



GCAGGGATTTTAGTCrrATTTCATTGAAAATCTC?rc^^ 

gaatgaatgaatgaatgaatgatgatx3gcgtgatgattgtgt 

tggggixxtattttgctcactgttgctgtgtggga^ 

tctgggaacttggaatttaggagggcotgtcac^ 

accaaaaatagggctcatgtggaatacotsct^^ 

tacgtggccaacttcx;aataaaaacictggac^^ 

riCACAGCTCATTTCriXXSGGGACT;^ 

GGTTTOITrcTCCAGACTTCGCCTCATl^K^^ 

AATCTCAGCCATGAGTGTGACCGTATGCTGAGTC^ 

TGCCTGACACAGTTGTATTCCTAAGACATAGTT^^ 

GAAAGACGCCATACGCAGTCACCCCrCOCTCTCT^^ 

GAACAAACGGAGGGTGGTATCTCTACreAAAATCTACCAAC^^ 

ACCACTATTTATATXrGTGTrTATATTCTATTAGGTATTATAACny^ 

ATACATTATATGCAAATACTATACAATTTTATGTTAGGGACT^^ 

GAACXAGTCCCCTTTXSGATACTCAGGGA^^ 

ATOCACCCCTTCCTCACCCAAGATTCTC^ 



^AGAGTAGTCTTCCXIAr 



^AGCTAATCAACTCTTTCTCATCATTTAGGC^^ 




:aaagtcagttoaattttccctitkxagcaa^ 
^aggacttacaatgtctcactcatgtagact^^ 



CTCAGACAGA(Xn>GGACAGCnCCAGCATCCAT^ 
TGGCAAACTCCXX5AAGAArrAGGACAGATGTA^ 

Ga»AGGCTA(nxrTCTAACTCCTt^^ 
CCACCATAOCCNAOXAAMTAr^^ 

in fl^^^ACT^SGAAAAATAAAAC^ 

lU TCTlCTTGGTCCATGAAAT(^JVf5aaaf~iYv^»^^i>/^...->™,.--^ ^^'ji » lAAAAAAAATAAAAG 



TCTCAGCTCTirrrATAT^ 
GAATi^l^TATAATItrACTAAAACAOGTAG^^ 



TArriWITCAAGAGGCnX3TAG(^GAAAGGG(nX3GGAG^^
TCA^CCAO^GACACTCTCT^^ 



TAGACnTCCX3ATTTCTOCAT(OTX3GTCAGAC^^ 
AGTOClXXSGCnrcCAGCTCn^

AOnCTAATAGAAAGGTAGGGCCATXmGGAGAaSGAlCC^ 

tggccag(x:atcgt^ 

TGAGAQCAATCTCGCCAACATtXnxaAACCXaxn^ 

GCAGAGCAT^ 
A^AAAACAGAAOCAC^ 

GTCAAGTG^ 

GGTCAACTACa3GAA«aAGGGAGGAACCATCTGAA^ 
^T'TGCTCACCAAAGGCTGA.Tr^ 

CAQCTATATGACTCTGTACCAGAGAGAGACCTIT^ 
AAffTATCTCATrATAAATCSGATtX^^ 

^^^'^'^'^^^'''^^■''^^'^^^ 
CGOCTCTAATCCCAGCTATTCAGCy^ 

ATTCCQCCAC^^ 
TCGATTCTGCCGATAAACATCATCTOS^ 

l«3ICTITAGGTGGGC™GAAAAGTCXri^^ 

™TAAATTCATACCNAACTICAGCXCCAmrmx^n^^ 
O^TAITTCCCCCTCTIA^ 

S^^^'''"'^'"^'^'^«^^«™AATCrAAACOT 



TCGCTTTIT 



ATOAGGGGACCKnXXXSAC^^ 
AGG^TOffreAAACATOIACAATrAGAACS^^ 

03 ATTCAAGGCmxnrCSAAAaSATGAAGTCCTAAAaATYV'i-TY-^ '^'^^"''■"^"ATIICTTr 



^OGATGAAGTGCIAAAAAlXXnTCAC^ 

ClTGAGCTCX5ATGTCXy.TC^^ 
raTCAATTGGQGGCAATreGAGaXWiAA^ 

^^^AAAAlXXXnriX^ACAG^^ 

TC3GAAAOC»G(nTATTATT(aVTAAGGACTAGACATATTAGAGACCAGGTO 
GG^G^AAGQGTT^TOX^^ 

'=°^'=<=^™*^™AAACCAAGTTO3a^^ 
ATOSAGGriCTGCTT^ 

AGClCAACAAGTAAGAATGGACJIGC^aT*^^ 



GCATTCAACCICCTGTTAATTTGGTTGCTC^ 

AGACCAATTGTTAAGGACXAATCTGAGGGCCTGriXX^ 

TGGACGCGOTCCCXriTGTCTGTtXrAATTCC^^ 

ATCCCTGCTCATCTGCAGCAC^^CT^GGAATGTG^ 

CTGCATTTTCTAATCTTCACAl^T^ 

GCGGGCAGATCACCTGAGGTCAGGAGTTTGAGACCAa 

ATATCrrACTAAAAATACAAAAATTAGCTGGGCATGGTGGTGGGGA(^ 

GAGTCTCnTGAACXXAGGAGGCAGAGGTTGCAGTGAGCC^^ 

AACTCTGTCTCAAAAAAAAAAAAAAATTCnTTCTGC^^ 

CCACAGTGATGTAACTCATCAGTGGTXrCNACCTCA 

CTTTCCimT^CrcANCTGACNATCAT^^ 

GTATGCTCCTTGATGAAGGGCTCAGAGCTAAGCGCGGAOT 

CTTIXXTTGGCAGCTAAGAGTCTCAAAATGGAGGGAT^^ 

CCTATTGTGTTCrrGTGTGTCTCCAC^^ 

CTTCCTTGCAGGGTTCCNCnxrCTTCCTCT^^ 

CCCmGAATCACTTTCAACTTGCTCACCTGT^^ 

TIXriXrrcTGQGTTCTCCGTACCGNC^^ 

OTATCAGGCAO^ACCCTCAACACTGTT^K^^ 

ATTIOTATTCCCATTTTGCAGACAAGCCAACG^ 

AGAGTCAGGATCTGAACXrCAAGACATAaxrCAGGAGTC^^ 

TTTTCATTCCXriCTCCTTTCCCC^^ 

TTTCCTCriXX:AAGCAACATCTGAAAGGCCCAGAC^^ 

TTGTCTCGCCTGTACCIXXXrrr^ 

GTTGACACCAGACTAATGCTnrrcACTTTTCC^ 

agagatagagcacttcagattcxactcgaatacattagaggattag^ 

gggttgggaggagatgtggcatttctaggaattcacagt;^^ 

gtgaagattcaggaattttitttttcitctat^ 

GAGAGATATGAGTGTAAAACAAGAGAATGATTCCCTCTGGGGTGC^ 
GGTATCAGGGCTGCTTTCCTTTATAATCTTTTAC^ 

ctgggcccaaaagaccagtctcctgtcccatacttggacattta 

AAGTCTCTTGAGTTTGGAAGACITCACICCCAGC^ 

CATCAATGACACATGCAGTGTCGGCCCAGGCATCGATGACACACG^ 

GGAGACCGGCACCTGGGCTCACACCrTCCCGTATATGACT^^ 

TCTCTCACTTTGCAGATCAGGAAATGGAAGTGCACAGAAGGA^ 

TGAGAATAAACCCGGAGTTGCCTCriGGGAGGACCTGGT^^ 

GGCACAGATTGGATTTTTTTTTTTTCCAAACAAAGT^ 

gctcackx:aacctctgcctcccaggtt^ 
ccaccacgcccagctaatttttrgtcttttttagtaga(^ 
gacctcaggtgatccacctgccccagcctcxcacagtgl^^ 
cttctgtgtgaggatgaagagtaataggaagatagagtggtgatct;^ 

TCGCAGGGCMCAGCTGGTTTCXrrTTTTGA^ 

GGAAAGTGCCCAGCAACCCCAGGCAGAATCATTGTCTT^^ 

GAGAAGAAATGGTTCTTCAGGTTTGGAGGGAGGCCGATAGAGAGTGT^^ 

GGTGTGTGTGTGTGTGTTTTCriTCTTGG^ 

GACTTTTAGCGAATGGATACTATTGAAGGAATGTTTC^^ 

GCCACTCAAGTCGGCCGCTCTCAGCTAGAATCCTQ^ 

ACGTCTCCACATTCTGCACCrCCCAGGCT^ 

TCCGAAACAACCACCCGTGCTGAATATGTAGGTGATACGTTGA 

GTAACAAACCTACACATCCTGCACATTGTACCCTGGGAACT^^ 

TCACXXXrrGTAATCCTAACAATTTTGGGAGGTCGAGG 

GCAACACGGrrcAAAACCXrCGACTlCTACTAAAAATA^^ 

GCTATTCGOSAGGGTGAGGCAGGGGAATrGCTTGAACCTGGGAAGCAGA^ 

CTCCAGIxriGGGTGACAGCGCGAGATTCTGTCT^ 

AGGAAGGAAAAGGGAAGGGAAAGGAAAGGAACCAGCIXXAAAAATTGTC^ 

TGAAACACTCXrrCAAAGCTGAGAGGAGGGTTTGGGAGAGAAG^ 

TIXrrcriTCCTTGGATTTCAAAATGACCCCC^^ 

TTCCCCCAGGAGATTTTTTTITTCTAAAGGAAAAGG 

CAATAGGACCCATCCTGGAGAGAAAAAG(Xn*lV'lUTlTAATTAG 

GTAGTGTCAGGGTTATCAATATCCGTGGCTCriT^ 

CCCCACCCAATCCCCKTICACGCCCCAAGC^^ 

TCAGCTCATTCTVTTCTGCTCGCCCTi^ 

ATGCCOXXriTXTrGGTTAACCCAAGCAAGAGGr^^ 

TTATGTATGGATTAACACAGTGATAGACXXrrcAGTAGCAAW 

GCCTGGGTCATTGAAATrAATTTCTGCTTTCAAGC^^ 

CGGGGAGCCGAAATTATTTCTGTGCACAGGCAGrrAT^^ 

AAAAGGATTATTCCTCCXXTTTTATCAAGAGCCCGAAAGT^^ 

GCAACGAAGITATTGGAGAGTTAAAACGTAAAGACTCXZAGTA^ 

CCCCAAATAATTGACAATTTCICTCCAT^^ 

TCCAGAGATAATTTGTGAATATCCATTCTATAGCATCCCAT^^ 
TGTCTCCAGATTCCnTOrTCTGTCXX^
TGGGCTXXXTOCTITrATTTaSGCCCTGAT^ 

TCCCTTCTaTGCnxnXX:ATG(XA^^ 
ItXIACAlCAGGATACCTCAGTGGGGAAC^ 

ACTGTCXS«3CICTTrAGGATTAGGCAATACTTT^ &Tvv-Pn,^nv^» . , 



^CATCTCrmCTA 




AGAC(XAAGAaaTCCGACAGCTCTCGAACTIX?rA^^^ 
/^GGACACCTATCSCCTAGAAAACATGAATAATTA-ICA^^ 



CCAA(nCAOCCATGA<?rAGCAATOAC3<XXaCTTCTAA» 
ACATCTCAITAGCACTCCCCAGAGCCTIC^^ 

CGOCOTCCTrACXriTCTA«3GAflCTGATTTAAAA{X^ 

C?^GCT«3GTGCG(n«XTCACG0^ 

T^GACCAGCCTCACCAACATC^ 

TOTAATCCCAGCTACTCAGAAGCKrn^^ 

TACTTCTCCCrc^^ 
TATGTCACCTCXXnCCTCTATCXT^CA^ 
^^^^2^^f^«^^™^CCAAAAA0C^^ 




TOX:ACCATCnTTACCAGGCTGGlCTGGSU«^^ 
^CAGOXn^GAfiCCAClGTa: 

GGCTTAAGAATACATTAGTAGGTACAGCAGGAGAGCAICTA.^^ 

■n«3TTCTACTCTTCAATCaGTTTAAGATG^
GCATAGlTAAAGlTCaTTAAArKn-I^ 
CTTACCAGAAICTATCAAGCTCITI^^ 
COCTTGAGG^ 

^Cl^GGTCAAAAGTXXZAAAATC^^

TCAG«:A«nTCTACTCATCXnxa3MA(nX^a3^ 
AGGGGAGGO^AGCnXXXZA^ 

ICTrmAGGTTTATITGlXaOC™ 
GATCOCT^^

CCTTCITTOCTTAGCCTCC^^ 
CCTCAGCCTCT^ 

GTITCACC7iTG?ITOGCCAAGCTGATCTCAA^^ 




nz-iim. .^,>™.,,™™ i * ill 1 imriuwrTTKXXnXXTTCTAGAACTAGA 

ACA^AGCTCT^^ 

a»TCXXTTCnGGAAGTGACAACTCTAGG(aon'AAC^^ 



AATC^TCAAOCAACATGAGTCACTTGCTACT^^ 



ATT^CACAGATTAGCACATCGTr^ 
AA^GTCGGCAGATTCC^^ 

CAACATCAGAOCATCACCTTCTTATACCrcAAAAT^^ 



CCAGAGATCAAACCCAGATCTAACTGGACCCAAGACAGAAAT^^ 
GTTATGTCAAGTGGAATATTTCTCGCTTTGGGG^ 
TATTKXrrATTTTTCTAAAAGGCAACATTAGTAATCAACT^ 
TTTCCTTTXIXXrrTAGTATTraXATO 

AATCAATACATTACCACCTCCCAGGAGTTGAATGAAAAACG^ 
AATTCATXSCATATCCCTATACAAAGACCXIAAGGTGGT^ 
GAGGCCCTT<31TCTAAGGAAACXXriGGlXXX^^ 
TGGGGAAGAGTTTTTCnCAAGATCANTTTTT^^ 

atcagacacx:aaagatacacatggacaatggccangacci^ 

gagcnagtcaactcagaatggtcaggact™^^ 

aggaccaaccagagaaagccacatatgcrccxxaaaccaat^ 

TXXXXlATGTCCACAGCXriXXAATCGGAGC^ 

TCTCCACCTCCTTTGAGCCTCTGt^^ 

CTCTATITCTTATTATTTCGGTCCCCTri^^ 

TTCCl XjnnnVj ' IHUi;i ' i ' I GATTGTTITrAATTAATAATT 

CICTGTCACCCAGGCroGAGTGC^ 

GCCTCAGTCTCCOVAGTAGCTGAAATTACAGGCC^^ 

GTAGAGACGGGATTTCACCATG1TO5CCAGGCT^^ 

GAAGTOCTGGGAtTACAGGCGTGAGCCACCGTGACT^ 

TTTACCAGATTGGAGTGCAGTGGTGCNAATCCTAGC^ 

CTGCXrACTTGAATAGGlGGGACTACAGGTGGGTACC^ 

GTTGCCCAGGCT(X?rcTTGAACTCCTO 

GAGCTACTICCCCAGGCTCTTaXX^ 

COTTCCCTTCTGrCCIOTACT^^ 

ATTTCATGTCnTAAGTCTACAGTCTATAACT^^ 

CICAAAGCCTTGATTTTGCAGGTCTGATTG^ 

GOVCAGTAGTGGAAGCAGGATa:AAATCTGTGTCTGT^^ 

TATCTGTAATTCTATQGAGGGAGGTTCCATTGTATATTTT^^ 

TTGTACITTATCTGTGTAACICTCGTAGCAGCT 

AGGGAGAAAAGCAGGACAAATGAACGCAGATTAAGGAAAATAAAAAAGAGTGTTTG(^ 

A1TTTTCTTIX2«'AACACATTACTACATACAGTGACC^ 

GAAGTATGTGTATNCCXXrITITTGTCTTT^^ 

TCTTAAATTCCATAAGCTGGAGACJCTGCCCC^ 

TCCGTGTGCCTTGCTT(XTAGGCTGTCCACCA 

CCCAAATCTAAAGCXrCGAGTCGTATGAATCCAGGACACCTGCC^^ 

GTCTTTGAGATTAlGTAArrAGCATATGTTGTCATAGTGAl^ 

taatacttttaaatcagaggccaacatggactttt^^ 

ggaatgggttoccatgccatgtctccccctgtccc^^ 

agcx:agccacgctgatgtagacagaatatttcggatttatattattg^ 

agagaaaaggtggcttggctgtttggtttatgtcct^^ 

aagcitaagaatgataaagaattggcitttctgct^^ 

ttcaactggtogattcacgaggagcsttagacc^ 

gccacaaagttttcttatatittaagcatcaaagtagcac^ 

gtgggcatccctggcatgcttaaacaagaaaggcaaatcagaga^ 

CCAGAAGGCTGTGGTTCXrrTrGTCTCT^^ 
GTTGAATACAAGCAGCTTVTTCAGTGTTTTACT^^ 
GGAAGTCTCAGCAGATTGTTAAAATQGTATCAAAGATQGGCl^^ 
GTCTAGGGCTAAACriCTTGCCTTTACXnxmxXA 

cacacagacccagcica3x3tccccag 
ataactcaggagtaggagggcagcagocx:aaatcttt^ 
cactxntggatcsggaatctggcttctaacaaac^^ 
tcatgtctagaatggggactattgctagctattccc^ 

CAGGGATCC2mXXX::AlTATNGAGTT^ 

TTATIWOGCATTCCTAGGNACTCSTTT^ 

TTGANGACCAGGGGTCCTQGTCACTTAAAAGAT^^ 

GGGAGQCTGAGGTGGGCAGArcACG3W3GTC^ 

AAATXX:AAAAAaTAGCTC3GGCGTGGrroGTGAGC^^ 

CAAAAAAAAAAAAAAAAAGAAAAAGAAAAAGAAAAAGGrnmTOGAAGCXATGT^^ 

ACTCTATAAAGGACTA1AAAAAAAGTI\3TAATAAATTATGC^^ 

GTATTTCTCT^GCGTGTTCGGGGTC^^ 

ACCACCCATGACTCTGGATTTATCATCXrrx:^^ 

GGTTCnrCnxnGATAAAAATAGACTrcGTAAAGA;^^ 

TCATCTTCCTTATTCATCATATAACTAAATTAT^ 

TCACAGTTCTGGAGTTCACATATCATTO 

catggtagaaggagaag^u^tgaagaaaaaittagaaactt^ 

<x:atgagataatcactcttaatattttqgtqc^^ 

gctcgtaatatataggcacatagatatataattaaaaaaacaagaal^^ 
AmrrAGTAATITCCTAA(»TTATACCATGOTAAATGAATTCTA^

ggccati^atoatitcggacacta™ 

WAAAGTOA^ 



GCTn^ATirrAATA^ 
TAAGATTITCaSWJCTCTCAAlT^ 



ACATT^TCTATGT^^ 
GTAITITOC^ 

CT^I^^rGAAACTCACXnNC^^ 
A^™TM^TAAC(mX3GC™ 

ATAATCHTTTTAGAAOTAGAAGATITAAATATATCAAT^ 

AGaSGGAAAAATAAAAAAAACACAACATOXJAGTGAGGGTCTAGa^^ 
GGTCGGTAAlTCriCTCTCl^^ 

C5AGGATGAAAC?rTATGCTCT(mXOTTXrrcATAflCAT^^ 

CACTCCAACAaXX^ATCTOCTTTCTC^ 
TGAGGCTGGCIAATCTTATAAM 

GOGTAAGGAAACTTANAAATOJATGGCAGflAGATCAAANGG^ 
ACA(X:AAGGGGGAAATCAGCOCTCTa^TCCAAC^ 
^^CAAGGGfflTCCT^^ 

ATACAOC(XnX3CTCTGACCAax:ATOGOGTCT^^ 
ATCaGOC?riTI«nTanCTACT^ 

TC^COCTCGATGAGA^ 

AATOAACAAGAAACTATTAATAAGfl^^ 

GCTCTAAACACTlXXCATGTAGTCaCTAATITAAaar^^ 



^>XC™_r' ."««x«»*_i iViA«fllJWJAGAGATTTATTCTCTCACaGTTn^^ 

GCTCCCArrAATOnTGACATTCGTCGGCT^^ 



CAITITACCITAACAAGTTGNCATCTGTGAA 
CSCTCATCGCTCTAGICCCAGAATT^^ 
CAAAATCGCAAAANNCCCCrmnCTATTAAAAATAC^^ 
ACTTOGGAGGCriGAGGCAGAAGAATCCCTTG^

CAGCCTOGGCAATAGAGTCAGAATCCGTCTCAAAAAAAAACT^AAC^^ 
GCACACATAAATTTGGAAGAAATXXriOmTAACCCAGTA^ 
AATItSAOSCACAGACAGATTCAATAACTACXnXS^ 
TTTCAGTKXZACAGCCTCAGCTCCTAAC^^ 
NCATCTCGGAACronTAGATOKXr^^

TAGGGCTCAGCAAICTAGTITTAGCCCTCCAQGGGATT^^ 

GATAAGAAAAAAAAACCTTTCGAGACATCATITGAATC^ 

TCTACTACTCATCTCTOCCTX^ 

TrTTACX:AGTOCCCAAAAGTCTTAGAAAAATGACAAT^^ 
AGATAAACSGGAGAACAGCATTTTCTTTTTTrAC^

TITCAGAAGAGCAAAGTCXX3GGTCXXX:AGGCTC^ 

TAACATOCTCITAGTGCATGCAAAAAAAAGCCCAGTAATT^ 

TATTATTTTACCTCAAAGCAAACACATCATTCAC^ 

CTTTTTAGCAAGCTTTTTQGATCC^^ 
TTGTTTTTAATTACAGGriTATATAAAATATTGri^

GGACTCGCCACntJIGAGCCCATCAerAAAATG^ 

TTOATTIGAaSAGGTATXnTGTQG^^ 

GGGGCXACTTAACACAQCTTTCATTTTTCj^^ 

AAAGCKXTTTCTCTCnTTT^^ 

TcronGTiGcnTxxxsGccran^

ATAGAAGCTCAGGATAAAOnTTCAGACClT^ 
GGCAGCCATCATTAAAAGTTTGTTATGGATTCTT(^ 
CTWrCAAAAAATCCATCCAACTGAGACTATTGGCTAT^^ 
AGATTATlTCGTACCAGCACATCTAGCTCy^CCTCCTA^^ 

ATATATTCATTCCTATTTATAGACATTGAAGTTAGATCCAT^^
CATCTATAriTCGTCTGCTTGAGCAAGGATATaGGGAGG 
GTACACCTTTCAAACAATTAACTTCACCAAA^ 
TCACACGTGAGAATGCCTGTCGCCCCCCAGAAATTAT^^ 
TCCCACAAGOCAGCAATCAGCACGCCATTTTTTAAAATTAAT^^ 

TTCATAACACTCTGTAGT^XAATCCCTAAGCTGGGC^
GACTTTATCTXXriT*AATACAAAGTGGGATTGAACAGTT^ 
TAGCriTTCA(XX3CGGCATCTTCT^ 
CTGTAATAAAlGTCTCTTGGAAGAGATClXnTX^^ 
AAAGCTTCAGGGAAGGGGCAGTGCAAATrGTTTCT^^ 

CAATCGCIX^CGCCCGTTAATCXXACCGCTITXS^

CCTCGQCAAACAGCCIGAAGCAAAACCbTCATCT^ 
TGGTCTGAGCTACra^SGAGGATAAGGTGGGAGGATCAT^^ 
CCACTGCACTCCAGCCTGGATCATAGAGCGAGACCAT^ 
GGTGGTGGTGGTAGTAGTACTAGrrcAATGCCCTGACT^^ 

GCACAGTGGGGAGCTACTCTGAGGATGAAATGCCTAATAAATAAA^
TXXTGCTATX:nGTACATAGAAGAATGTTTTAGTC^ 
ATTrCATCIXX:ATATAC3VGGACTGGAAAATAAGA<^^ 
TTCCCACTCAGAGAATTTTCrrAGCTOT 
TGAGCACCTACCATCTAGGTCiCTCATCCX^GATAC^^ 

GGCAATXnXSAGAAAACAGAAGGAGCACCAGGCAGTCGTGGGCT^
AACATACATTCATXIATTnXXriTICTT^^ 
GGTCATIGACTCGGGCTGGATTCXXSGAAGGATCCA^ 
TCCKXSGCTCTCGGACTCAGC^^ 
CGIXX3CAGCTCAC1TTTGICCAAGTG^^ 

AACTTAATCITCGAAGTCACATCCTA

GTOGACAATIXMTCGCITATCGTATAGTCAC^^ 
CCCCTAAAAGAAACCCTCniACCCTTC^ 
TCTTTCrATGACTTTACCTATTTCGGGTA^ 
AATtmrrCAAGGTlTCTGlTATTTCATCTAT^ 

CATXXXXrTTGTATOGCCAAATAATATTCrrATICT^
CCACTITCTGGCTATTATGAATAATACTGC^ 
CGTTCCX:ACAGGTOXriTCEGAT^^ 
TATlTTCATIXXAATATCnCTTTATGGCATGTC^ 
CATCATTTAAATAAAAATAAAGTAGGGAAAGAAAACAGATAGTGAT^ 

GACATITCAACTACACCCTGAATCAAGAGAGCC^
AGTCCAAAGNGCXTCTCGGGTOGGAAAC^^ 
CAAGGATAATAGACAACGCTATCAAGAAGGGGQCCCAACC^ 
CCCTTCACC1TATGGTGATCACTATIX3G^ 
CATCGTTCCCATCAGGAGGACAATGCGTAGG^^^ 



AAGAAAGCAACTAAGGTGACCAGGTGACAATAATGGAGATfl^^ 

TGACTOCOV(nGTO3CXTC3axa^C(X^ 

GGATC»TrACqA«XnOTACCCTCAAAGGATX3CCCTO 



TCTX7ICTC3CCCA1XX»TrATGTACTCCCTG^ 

TAAO^GTrAAAAATAOSAAAATAGATITCMGGCSGAAAAAArn^^ 
TGOCJrAAAACQGAATTTGCCAAACATGAATAGGCACTTACAATO^^ 




^..-«^™^v:„ . ™.^.™Jn^J_v.^J^HJ^J^,«, 1 1 WlVaOiAATTTTCTGTQGTGGTCCGGGTTOCCTCGAGTO 

AGCTCnX3ATGACCGCX:AACACATTICCAA<nT^^ 
CTTCCCACGCACATCCTAAGOSAGTGTAN^^ 

AGTAGGGATATTCATAACTtnxyvATTTf»aTATaarT'a'i»T-Biijin..».nTv.».,r,,.x™,, 1 




„ 1 J l^^"lUACAlvrGATITCOT^^ 

TGa3CXX:AAGAGA0GTGTGTCTCCTTGA^ 

TACctacarKacrca^GTArrccAGAiT^ 

CCTGAGCTtavCCCAAGACACACATCWAAACAlTftAT^^ 

GGCTGTAGTCn-AGTGATrcCAOTCrcAGCT^ 

TCAAGTAGCTGGGAlTACAGaaVTCCACCACC^ 

TTCXXCAGCXTCCncrreAACTCCTCa^^ 

^OCGCTGOGOCT^^ 

(OTAAGGCTCGAaSATOGCTreAAGCCAGGAGl^^ 

mAAAAAAATTAGCCGA<?roTGGIWMX3TGCCACACT 

GCOOGGGAGGTCAAG«rreTA 

CTTTAAAAATAAATAAATAAATTAAAAAGACACTCauSAriTCGGG^ 

TGCCATCGGCATGOCTAGAGGGATCCCACnriTACACA^^ 

TTACAGOTAi™ 

TTOTTATTTACTCTCrrAGAACCATrnaiAT^^ 

CCTCGAGICACTGGAAAA^^ 

AGAGC^GTCCAITAATIX^^ 

TTCCTGGAGA(aTTAGTGATTGAAAGGAGGAGATCTAGAAAGAAG^^ 
GCGTTCTACX3AAGGCriCCTAAa3AGGCTCO^TACTAAATl^ 
TGATXnCCAGCX3GGTGACAGTTCCATCACAAAAG(XiG<nTT^^ 



''-''""'-^inftulV^li/WjOCAGCTICT 

TGAlXnCCAGa3GGTGACAGTTCCATCACAAAAGGG(^^ 
CTCTPJAC^^ 

•irivivJlACTTCXriTriTAAATICAGTCACGT^ 
ITCCTACTAGCl^ 

GrnXZAGACTATCTTICTAGAAGCAGAGC^ 

TTCAATGGCnTATCTCAATATATATCTACTTTT^^ 

^TGGCTCACG^ 

^JACAOGGTG^^ 

ACTCGGGACaCTGAGGCAGGAGAATG(X:ATGAACCTaX^ 
"°^?™=«50GACAG^ 
TCATOAGAAaCCXaVTOACTCTO^ 
CTCGCATCAGC^^ 



"TGfttSflGQCCGAOGCGGGTGGATCACTTGAGGTCAGG^ 

CACTCGAACXnaxaAGOGGAQ^ 

TTOTCTCAAAO^ 

TCCTCTTOOSAATTO^ 

AGATTCCAOTCT^^ 

^AGAACA^^C?^^ 

AC^GGACTIOTAT^^ 

AGAAGAGCAOXaTC^^ 

ATAAAOGGAAGAAAG^ 

^CTCTIXXTGATGAGfl^^ 

GOGAAOCICT^^ 

*^*°™»°°™5ACCTrcAGAACGGC^^ 
TACAACTaaM3CTAAGCNAGATG(»TAAGa:AAAAACCAACC^^ 

GATATaSCOClTOXriCACATTOX^^ 
^^J^^AA^^^^G^A^^ 

A^AA»^;«CT^J^^ 

CCACAAAGAGlTAOrrACCXXXSAAATGTCAAa?^^ 



AAATCAGTCAGTTTCAGATATCCAACAC^^ 
AAAACTAGTTACAGTTCATK3GATAATIXXXrmCATCC^^ 
ACAlXSGCACCOCTATTGATTATTTCAGTEaACTCAGAA)^ 
AAGGAAAAATAAAGTTTCTCIO::ACGTTA(Xri T A"riH'lC^ 
AGAACAAGTCCronXIAAAATGGTlXXXnTT^^
TTTTKnTTTTOCTCCAGAGGCAGGAAGAGGG 
GAAGCTCCATCTTTCTrrGTGAGGGCTTX::^^ 
TCCCTTlCACCAGCAGGTCGAGbKriCTAACX^ 
TCTICTACCCAGGGTCATATCCTTGGATX^^ 

ACACXrrrrcCAAGCTtKXXXIAGTGACCC^^

ATXXrrcCATTCGTGTGTTACGTTATGAGAACTATTAC^^ 
CATCTCACCCTCTGAATGTOSGTTATGGGATAAAT^^ 
TTACAGATQGTTGCTCTACTAGATGAATAATGTC^^ 
GGAAATAAGGCCTTTTTAAITAAAAAAAAATTTTTAT^ 

CTCAAACTXnTCGGCTCAAGCAGlTCTCCT^^

CCCAGAAATAAGGTCTTTATAGATGTAACTGAAATAAAGATAGAGATC^^ 

TTGGAGTCGTCCCATCTACCAGCCAAGG^ 
CTOCTGCAGAACCXXXT^GAAGGGACK^^ 
GAATTCClXnTGGGGCAGGTACAGTGGCrcAlXX:^^

aacccaggagtixxagaccagcctgggca;^^ 

GTCGTGGGGCATGCX3m?rAGlXXX:AGCTACTTGGG^ 

GCCGTCAGCTGAGATTCHX3CCACTCX:ACTra 

ATAAATAAATTTCTATTGTTTTAAGACAATTIXXXri^^ 

CTTATTTOANGTACAACAACCCTGCCCCT^^

ATTTTCTTACCTATAAAGTAGAGATTCCCTATCTCATTT^ 
CATTTGGCyVTAGTXX:CTGATTCAGAGTAAGTX5ACCAATAAATAG^ 
TAAGTATAT^rITAAGAAAGCACCATCAGTATTCAATT^XXX^^ 
AGAAAAATACATXnCAATTAAATGGATTTACATCTTAAAAACn^ 

CITCTCATCriCAACXX:GTTAGGATTrcGCC^^

AGCAGGAACATCATCTGCTTTTGGGTAGTTAAGAGTTAAGTATCAC^^ 
GTTATCTTACTTCACAACTCTCCACrTT^^
ATTT-AAATGO^GAriTAATTAAATCCATGACCAGAAGACAAACC^^ 
TTTCCATCTTCAAGGTGATAACCGTAGCATAAAACTGTC:^^ 

ATACAGCTCICAAGGCGGAAGTGCTItnTAACT^^

GACACCAGAAAATAGTGTATGGATTGGTGATriTCACTA;^ 

CTGTGATCATTCAGGCATGCACAGGTATCCTGATGa 

GAGAAGTCATTCCATCTCCAAACTGTXSCCTAAAT^ 

TTGCAAGGAGAATAGCTGAAAAGATACTTAAATAAAAGTGATCGGATC^^ 

CCCAGAGGACCGTrACrGGTCTGCAGCCXXXrATT^
TTTCCAATGATAACAGCX::ATCnX3ACTTGA^ 
TAGAAAAAGTAAAGCAATACATCCI^TAAATGCTGTGCCTT^^ 
TAAGAGGGCAT^GGCTGAGAGCCXXrGGCCACCGCCGCCGT^^ 
ATOGGGTCAAGGGGGAATGGCCTGAGAAAGCAATGGATCATGC^ 

CAACGTXX:AGACATGTOCTCAAGGAA(:7y^^

CAGTAGAAGACTACAATAAAACAGGAATTTTCAGAGCATGT^ 
AGAATAAGCACTAGCerrAGAGAGAGGATQCCX:TGC^ 
AACTCTTCGCCICIXrrAACACXlACGTT^ 
TCATCGGTAATGACACCAGCACXIATAACACGAAACXXXr^ 

CCCA7U3CX^ITOCATACTTGATTGGAAA

ATTAGTCOTXXnTACGACTrcCTAAAAGAAAATATT^ 
CTAAAGTGCAAATTGCCAAACAAGICAAGAGACTC^^ 
AATTATCCATGTTATTTCAATACAACCCri'iCi'i' ^ 
<XTCCCGCAGAGGTTTXnX3CC^ 

TCATCCCATTTCAGTTATGAGGGTGTATTACTGAGCAa
CTAAAACCTGCAAGAGACCXXriTGAAATTTra^ 
TTCXXIAAAClTCAlGGAATTTTGCnT^Cnx^ 
CTTTOGCTCAGAGACTTATTGmTITGGT^^
G^IAGATCnXXXX5AAGCACTCCCCATa^^ 

GAfflTrKXTICITTTCCCCAC^^

ATTGlCCTGCXrrcAGCCTCCTGAGTAGC^^ 

gagatggggtttcaccatgttagcx:aggctggt^^ 

TGCTGAGATTACAGGCTTCAGCCACAGCACXX:^^ 
GTTCCACANCTTCTGGCCANCXr^^ 
GAAAAATAAACAAGAACAAATGGAGAATCTTT^K:^
TCmKXXMATATmTAAACy^TTACT^ 
ATCnXSGATAAAGTCAGTATACAATGTGGTCTTTOn^ 
CAAGTTGTCTACCCTTATTAlCCCCrcAf^^ 
TTAATATTATATATCHTVCATTACATGGTAAGTGCTAAGTAAAT^ 



CTOAAT(nTrACATAGATGATO(X»TCCTOVTATTG(a 

ATAriTAACAGGTAAGGaAAATGa3GTTICa«aa^^ 

GGOmTAATTCA^^ 

ITrAAAAAACTGTiaiTATCGTTCATCTaCATAAAT^^ 

AAATATITAATCTAATlTAlTOGTTATTrATrrAACAAATAC^ 

CACrrACAACTAGTGACICArrrAACCATTICAC^ 

acaga«:agtgaaaagactccxxxxxtaci^^ 

AAAAAATAAAlT-AAAlTTITAAAAACATGACmxnXSG^^ 

GGCAA<»CAGAAGGATCX3CTreAGGCCAC3GA(mT^^ 

A^AAAAGAAAATATATATATmrrmAAATCATCACT^^ 



CA-TCAAGCAGAAG^^ 
*^«5CGAGanCTCAGACTTTGT^^ 

TGAGcmrrAACAGCcxTCCATATrmACTA N i 1 i mvi T iTmrni ^caci^^SSSS^r^ 

^^^^^^'^^^AA^^ 

CAACTCAAAGTCTACAGTCTAAAATXrATOVTCATCTAi™ 
TATTGCIXXrAGGGAATAAATCAGTrrAAAGCAAGTTC^ 

taaa™tt^a(j^^ 

TC^AAGCHW^AGTATraiTC^ 

AAQGGGAAGAAAAACAGTlTTGaATAAAAACG^ 

TAMCAAAmAAAAAK^ 

TTTGCATTCAAAlTCTCCn^AGGCAaT^^^ 

CCCTGGGCAGGTGCATGAOSGGCnCO^AG^ 

^AACTCAATCTTraAGATCACATAC^^ 
TGCAATCGCTri«XxaCCGTTrT^ 
GGTCAAACATCXnX^AO^^ 
CAG^TCCAACCCCCAAAACT^ 

CTCCCTGGCAC^ 

AAAOTAATCITATTICT 
OCTTAGCCACTGrrr 

TAGCA(»CATCAT1TCTCTGTITAATCTGTCTACT^ 
^GCTTOOTGAA^^ 

GANOCTIOT 

TTATAGAAACTCAATICAACAAACAITTCAGCACCT^^ 
T^GCAAGGCOG^^TCX:^^^ 
TAGGTaCAGATCTGTAAAATAT^^ 
GGTCTGAGATAAAOCTAAAAGAGGCnCAAGAT^^ 

^^^^^f^^AGAGA^^ 
^GCTGAGGCTOGTaZAGOTACT^^ 

GAGGTAGAGGTAGAAAGATaX5TGAGCnX3GCT«X?I«^^ 

CGGGlXXaTCAGCTGAGGrcTU^^ 

MW^^GCATQGTCXn^^ 

GCAGAGGirca*^^ 

TAAMAAATAAAAAATAAATATA^^^ 

GTOA^xxnxm^^ 

™^3^^e3^CAGA^ 

O^OCTGGOTAQGAOOOCAGCCXTM^ 

GCaClTCGAGAaiAGCC^^ 
CATCTOCAGTOCTQGCT^^ 

^^^ATOOTTCAG^^ 




AGTCGACACNGTGGQGCCATITCAACKrrAAATTAA^ 
O^AGTGCGAAACATCCGGACACATCTAGI^^ 
(nxnCCCATKX7^CAGTITXX5GTATGGAGCAOT 
mATTCGCCAGGCATQGTCGCTCATXX^^ 
GAGmGAGACCAGCCTCGGTAACATAGTGAGAC^
lOW^GCClCTAATCCCAGCACTTO^ 
ACATO?IXSAAACCCCATCTOTAATAAAAATO^^ 
GGAGGCreAGGCAGGAGAATCACTOW^CC^ 

AGAGCAAGACrcraCATAAATAAATAAATAAATAAATAAACTGGACGT^^ 
NGAGGKTimGGTCGGAGGATKXSCT^
GGGIXaJCAGAGCNAGACrCTCTIXr^^ 
GATAACAlOTAGACACTTAAATCAGAQCCTTm 

CCATAIXXAAAAAAAAAAAAAAGGAAAAAGGGAAGAAGCAAAGGGGCAATACAGGGC^^ 

ACAGACCGAG(?mX3AGAGITXXrroCI^^ 
CAGCTXnTCGTTAATTACXTCAAGlKW^T^^

TACTATCTAriTCGCCTrcGAAAGTTACTCT^ 

AGCTXXHTAGAAAAACACAACnCTCGGGCCC^^ 

CIKXrnTICTAGCAAGCGCCCAGATC^^ 

ITTrcATTTCTAAAGAOITCTTCTCCAl^^ 
TGCTTCACAlTCrAGACACTCAGC^

AATCAGGArcrcCTACAATCGClTCATATTT^^ 

TCTICTATITAAACTACTAAACITAATAGTTTAAA^ 

TCACACCTITCAGCTCAGTTITnTCT 

AATTATITITCCTITAGCOSanCTGGIOGl^^ 
cCACGAGaX^VGGAGTTXXSAGCXriX^^

GTCTCTCAAGATACAAGAAGGCCACAT^nXXyi^^ 

TCcrrGAGAAGTTTGAGATTCGCCTXXS^^ 

TTAAAGAAAGGTATITCTCCCXriTCAC^ 

AAANGGAOTAATOXriTNr^TCCCtmx:Tt^^ 

arrvrou^f^pj^cATA^^

GAGGATAACAOTCTAAGATCTCAGATriCAA 
ACSGATCGTCATn-AGAACCTGGglTCK^ 
CCAGmCCAAAAGAGCAGGTAAGGAAAACACTGAGTGOT 
TAAAAAATAGAGACAAGGAGGTGGGCGCX5GTCGCTCACAC^ 
GCCTCAGCTCAGGAGGGCGGGTGGATT<X:Cr^

CTACTAAAAACATTAAAAAAAAATTTAGGCXTiaSTG^ 
TATCACTTGAACCGGGGAGGTOiAGGTOSCAGTGAGC^ 

CTCTATCICAAAAAAAAATAATAAAAATAATAAAAAAAAAATTACATGAAAAATAGAGACAAAG^ 

CACGCCTCACCTCTTCTTTTCAATACC^

AACAGTCGTGAriTAGGGCATCAAGAGCTCACTCAT^ 

CTTOXyrcACAAGGAACAGAGTITTACTI^^ 

CTGAGGAAAGGAGGCAGTTTTGGGTAACNTGT^^ 

TAGCTGATTCTXX-nXnXXTC^ 
GACCGTCGAAOTCAACICAACCTCATO^

TTCCCTANCXXSCAGAAATCCACKX^ 

<XrriX3GCTCATOTATGATTCGACCTX^^ 

AAACX:CACCTAAGTAGAAGGACCTATCAACCATGAGGGT^^ 

TOXyVOTITAGGGCCAAGTCACnTGGAATCACAGC^^ 
ACCCATraXAGTICCCTCAGATTOVTTACXS^

AGCrcCAACATCCCTCGG«TGGAGA^^ 

riTITAGraXSACGCATXIM^ACTCAACT^ 
TICAGTCAAGCAGTrATACCTICAGAAAAGAGGGAQC^ 
CATAGGAACACTCATAGTATGTATCAGAAACTAGTGAACACAT^^

gccictitogggagtgttcix::aatgi^ 

GTOCrxmCNACTCTAACATCCTTCCAC^^ 

TCATGlCTAGTCAGAGTVrcCAlTCTGTT^ 

CTCAGCCAGGCXAQGAGTATOXrCTAATQGTAGA^^ 
GCCTCGGGACGTCCCTTXK3GAAGCTAT^^

TCAGAGTCTCACACAATAAGGCCATAAAGA^ 

CAC T X' lUXJiT ATAAATCTTCACCTAGCAT^Xy^^ 

GCAAAAATAGCCCCCAAAATGAAAATTAATGITGAGAATCTAG^ 

ATATITCACATCGCXrATATATOITriGC^^ 
CACAGAAClWCCTXnxXnCCCTCTTT^^

GGCTO?K?rcmTICTCCCATXnG^^ 

TTCAGGGAAAGATCTCAGAGCAGGGCCAAGAAAGGGAAA^^ 

AC3IAGC3mXXn'AGGGAGGGAGGNAGAGACATGGAC^^ 

GAAATAAAACAAAAGGGCAGGAAAGGAAGAAATAAAGAGAATGAGATAAAAGGAAAAG^ 



TCTATAGAOTGGGGAAAGACATTC^ 
CNTITOKXriCTANGGAATt«rrcGT^^ 

GTCAGGAGATCCAGACCATCCNTGGCTAAGTCXXnXS^^ 

GGTGGCGGQTCGCCTCTAGTTKXX:^^ 

TTGCAGTTAGCAGAGATGGTGCCAC^ 

TCnOTITTTGGAAGA^T^^ 

CCGGGAATCCAGTTCnTITAGGAAAGGCCCAGTi«^^ 
CTGTAGCTATCTAACAGAGACAGAGTrcACITGC^ 

cacxx:aggaaggcagtgttcttctttt^^ 

TCCATATOCAAGCAGCTAAGGCITGTGCCTT^^ 

CCOZ^TCGCACAT^ 

CACATTITTTTTTTXXyATGAGAGTACTGA 

ATATTTACACTCmCAACACA<nTCTC^^ 

GAAATATGAACCCAATATGTTCTTAACATCTCrrATG^ 

AGAlTAAAGGATTGTACCTATaSAAAGAAGAGAATOSAAAA^^ 

GAAGGATTTGTTCTATTGCATGCTGATAAATAm 

^^TGCAGCTTCTCAGGGCCmxr^^ 

ATGAGGTTQGGTAATGAAGAGAGAGAGACCCCAATAAACATAT^^ 

ITOGTTAGCACAGCXrKnATTACACAGTACT^ 

CATTATCnTATOnAAATTAACACTGTAT^^ 

TATTATTTAGTAATCCCTGAAGCTCTCAAGTAG^ 

CGCATGTTAAATGGCCCACTCXXriTlvriTlG^ 

GTGTGGTACAAATGAACTTGCTACriTGGTT^ 

mSAAATC^^ 

GTTTTCTGCAATAGAAATGCATTTCAAATATAAGCTAGAAA 

AGATCICCAAATGTTTTGAAACGCAGTGATTTATTO^ 

TCCTIAGATCTTAATCAGAAATAGGAAAA 

TACTTTAAAAAAAAAAAAAAAGAlXXriGACAAATCCTACTTT^^ 
TATITTTCCTCrcTCTAriXXIATAl^ 

cccx:aggacaggtgcotxxtcagcggagacaa 

CTGCTTCTCTACCGGCGCAAATCXnCCCCaAGG^ 

AAGGGGCAAGGGATTGAGGGAGAGTCCTCCTGGCATXr^^ 

CAAATGTCATCTTGAGTTCTAGTTCTCATAAT^^ 

TGCAGTrACCCC»TGATGrrTTTCATGGTAGGGAG^ 

TTTGCTTGGCACCTCriCCTTCC^^ 

TGAGACCTCCACAGCCATGCCAAATTGTGAGTC^ 

TATTAGIXXXT^TGAGAACAGACTAATACAGTCAGGGAIG^^ 

TTCCACCCCACCAACAAAGAATGATTCAGCCXAAAA^^ 

C^^^'TTGGAAGGTTGCCAAACTGTGCT^^ 

GCTCATGCTCTCTAAACACCCCAGACATCATAT^^ 

CTOTCTTCGCTITTCCCT^^ 

CTTGACCTTCXriXXnCTC^^ 

TCCCTTCACCCTTCATGTCTGAT^ 

CCMCTGGCTTTATTATATITGGAAGTGAGCC^ 

TCCTGCATAAAGTGTTCXTTACXATI^^ 

GACAGGGTCPCACTCTGTTGCCCAGGCTGCAGT^^ 

AATCATTCTCCTACCTCAGCXnCCT^^ 

TNAGTAGAGAtXXXSGTTTCACXZAlCTTGACCAGG^ 

CCAAAGIG CTQQGATTACAGGTXmSAQa 

CTTTTTCAGGACCATGTCITGTGTAGGGC^^ 

GGTOXnCATGCXTTCTAACCCCAGCACT^ 

TGGGCAGCACAGTGAGACCCa?rATCTACAAAAAATAAACA^^ 
CTACTCAGGAAGCTGAGATGGGAGGATTGCTTQGGCC^ 
TGCAGCCTGGGCGACAGAATGAGACCXnxnCTGAAAAT^ 
GCATTTGGTGGAAATAGGGCA CACAC ATG^ 

gcttgtccctcx:atctoggtgctitc^ 

TATTTAAGCACIGCCAGTGATTCTATATAAAC^ 

^CTAOCCAGGATGGOT 

CXriTIXnxnTCAGGTTTCTGrAGTGT™ 

CCCCCAGTGTTTCTGC^ 

TGAATTTCCAGGTTTTCAGGCCTGCCTGGT^^ 
GTAAAAGATITA CAATT ATTTCATTTICAACATAGCl^ 
GCXrrACAGGGOXSCTTTAGAAAACATCTGGCTGGGCAT^ 
AQGCTXWIGTGGGAGGATGGTTAGAGCCTAGGAGTTCGA^^ 



AATAAAAAAAATTATCTOGGTATAGTGGTCnXXACCT^ 

AGCOTAGGAGGTTGAGAGTGCAGTGAGCCGTGTTGC^^ 

CAAAAQGAAAAAAATAATAAAGCACTGTCTCTCTCTACCCT^^ 

GTTTATAGGTGAGTGATATGGTTTGGATGTC?IX^^ 

TT>GTCGGAGGGACCCAGGGGGAGGTAATTGAATCATGGGG^ 

CTGATGAGATCTCATGGGTGTATCAGGAGTTT^^ 

TGOCTTTTGCCTACCACCATCmTCTGA^ 

CAGTCTCAGGTATGICTTTATCAGCAGCGTGAAAAC^ 

GTTXSTACTTCAACCTGGAACTGTCTAACTCTT^ 

GAAGAGAAGCACCAACTTITGTITTCCAATTC 

GCTTCCCTACXAGriTOXAAAGCaTCAa 

CCCAOTTACAAGGAArrACAGAAGAGTGATGCCCCCITCA;^ 

AGGGCCATATCXXrTCAATCnxrrcACCCC^ 

CACCAGTGGAGTCCCATAAATAGGTTGTTTATGATCCC^^ 

GGGGTATGAGTCGCCCCAGGTTTaxriX^^ 

T lHUCi' r AAAGAGACAACTgrTITrGGTGAATAGGCX^ 

AGATATCGTTTGCATGGT(?nTTAAAGCTAAAACT^^ 

CTATNGQGGTTGCA™«rCACGGAAGA(3OTT^^ 

GCCATCCAGAGCTTGCCAGCAACCriTTAGATC^^ 

TGTIXXrAGACAATGTATATTTAAAAAGCAATTATTTGCC^ 

TCTGGCreGQCATTGTTrATCTCCACXriOV^^ 

GT(XXX:ACCACTA<XX:ACAGGACXXnx:^^ 

TTTATTACTlXrrAAGTTCCATTCCAAAAATATGC^^ 

CX:AGCTTCTATTTTmTCCAC:ATAATCCCTAACC^ 

ACACAGAACCAAATAGTCATAGAOTAGTATTTCAGGTGGGGGGC^ 

GGAACATTTGGCAATGTCTGGAGACATIXmXS^^ 

GTOTXn^ACTAATCGCCCTCTATTGCACT^^^ 

AGGCnX^AGAATKTITGATCTAGGGGAAGATGTGATTTC^ 

TCTCTTTTCCX:ACTGAAGTTGTGGTTGTAAr^ 

ATGTAAAAAGGATTTAGACGTCTGGACGTGGTGGCTCATACXriCT 

TCGGTTGAGGTCAGAGTTCAAGACCAGCCTGGCCAAC^ 

AGGCGTCk?rAKK?ra^TGTGCATATAATCCCNAGCTAC^^ 

ATOGCACTAClTCATTCXAhKSCCTGGGCAACAGAG^ 

AATGGTACAGTCAGAGGTTAGGGGTCGGGTCAAAGCGATCCC^^ 

TGCGACXZATCCriXriTTACTGCTTCTGC^^ 

TGAAACAAmXXXXriTCCCATTGTCAAGGCA^^ 

TTCXlATTGCTTTOGCroCAGTATCCl^ 

AGCCmxx:AT^K5GTGNTGOTATAT™3AAC^ 

A^K;CAAANGGCATlTKrrcCAml^CCATGT^^^ 

AGAAGCATATGGCAGAATTCATATGTGAACAGATTGCTTTT^ 

GCTITTTGAAGTTACTCTTCCTGCTCTT^^ 

CATTCna^CTANTGriAlCTTTACT^T^^ 

GTTTCCCATAAATTAATAACACATCTATGAATATTTCCCAi^^ 

GTAAACACAGGGAATAGATGAAGCAAGCTGCTrCTTT^^ 

TGCCTQCTACTTTCATTTACCATCOTACTC^^ 

A^m3AACATTK3GOTCTC3aAAACTA 

ATTATTATTATTATTGTCmTACATCa^CAGGCT^ 

TTTCTCTCACTCCAATTTGAATXXTC^ 

CTCTTGCCCTATTTG(XX?IOTCTCT^ 

TTTCATGATAGAAGTAGGTCATGGGGGCGGTTAGCTGACT^^ 

GGTCTGCX:AAGGAGAGTTCAGCX3GGCTGGATCAG^ 

(XAAGGTCACAGCTGGAAATAAAACTGTGTATTCCAAATCJ^^ 

AAGTTGTCAGCCAAGAGAACCAAGAGCCAGACACAGGGAACC^^ 

CATTCACCAGGTGCCCAATATTrTAATGTCCCATQ^ 

AGGGGCTGGACrcAATTAGACCAGGGGTTCrKy^ 

TAGAGATGCAGATCTIXSGGCCCCGCCCCAGACCT^^ 

TAOSCTGCAGTCTGTffrTAAGCITTCTG^ 

ATAGGAGAGAAGCCACTAGGGGTGGGAATCAGAAATACGGTCTCXXiri^^ 

CTTTrrAACTCTCTCITCCT^^ 

GGATTTTTCTACAGCTTCATATTGQGTTACCAGTG^ 

CTAAGTGACTTCAGACTCTATAGACAGTCCC^^ 

AAACCATACACATCCAGTAGAAACXrGAACTrTACAaTTI^^ 

GTAGGATACTCTCOCACAATGTTAGGCCACAGCAGCGAGCT^ 

CNTAOZAGGCACGGTGTTQCCTGATAACC^^ 

TAAATTTCCACCTGTTTTXX3GGGGGGTACAGGTO5^ 

TTGGTCCACCCATCACTCXAGCAGTATAC^^ 

CICTCAGTOCCXIAAAGTCCATTCrATCA^^ 

CACAGTCSCTTQGTTITCrATTCCTGAC^ 



ATCAGTTCATTCCTITTCATGGCTGAGTACT^ 
TGQCCAITTGGGCTAATTTCACATTTI^^ 
ATAAGGACTTCTTlTCCTCTa 
AGAACrCTCCACACTGTlTTCCA 

cgcattcaccxx:aatatctattattt:tt^^ 

AACAAAAACAAATAGGTGGGy^CTTAATTAAACTAATCAGCTIT^ 

aacccgcagagtgggagmaatctatacaaatctatacgtct 

AOTAGAAAGAAAAAAGCTAACAATCCCATCAAAAAGTGGGCTAAGGACAT^ 

AATGACCACCATACGTGAAAAAA'IXXrrcAA v^^iauju^ 
BAC#1 contig#208

ttttagttgacatgtcattatggattcttgggcataaacgtttatatgaattttgaatattaggaaataatcttggaagc 
tatattagttttctaaagctattataacaaattaccacagatggagtgcctcaaaacaacagaattttactctcttacag 
tttcagagaccaaacatctgaaatcaagtggtttgcagcgttggctccttctggaggctggaaggagcatctgttccatg 
cctttctcctagcttcccatgactgccagccacccttggaatttcttggcttttatcttagtcagtctaatctctgcctc 
tgtctttccatctccttctctgtgggtgttccttctccccttctgttttgtaaggataggcctgtcattggatttaaggc 
ccaccttaatccaagatgatctcattttaagatgtgggacttaatcacatttgcgaagaccctttttccaaataaggtcg 
cattcacagatccctaccattcaactcactatggaagcaatgggatgggaggtaccattcaactcacagattcctaccat 
tcaacccactataaagcagtggcatatgaaaaggatgggaggttgctatgggaatggtctgccccagatgcaggtaaaat 
ggggaagcattttatatagaatggaaaaacaacagtaaagctggctaaaacttgatctgctttttacagtcatgtgctga 
caattctaaacattatcaatgataaaatacttctccctcatgatattaactgtgcaatttttgctgcccctgctacatac 
atgctttattgcttggaaacaggattacttggtgaaataatgtgaatctttcaattctcatgatacatgttttcaactta 
atatccagggtgcaggaatatatattcccaccaactttatacaagataatacaaatataatagtatgaataaatatgcta 
atttgatagaaaatattttaatatttactactagtgtagttggacgtgttaaatattgcaatggttatttgtaatctacc 
tttataaaatttctaatgtctgccactgtttctgttagactcttaagaagatacgtgtttgtctcatagatttgtaagaa 
ctctttatatactaaaaatatggggcttctatatgattgcatatattcgtcacttttgttccactgttttagagagcaga 
agattttaatttctataaagatttttatgtttccctttattgcatgccaatgtttgttttttaaaaacttgtacgtctta 
ggctgcgatgtgaaaagtgtcttttttttgatttggaaacagagtcttgctctgatgcctgtagtggtatgatcatagct 
cattgcagcccttgaactcctgggctcaagtaatcctcctatctcagcctccgaagtagctgggactacagccacacacc 
accacacctggctcactttcgggtggttgttgttgagatggggtcttgctgtgttgctcaggctggttttgaatactggg 
ctcaagtgattatcctgccttagcctcccaaagtgctggaatttcagaggtgagccactgtgcccagcctcctgtttttt 
ttacatgtatatafeatatatatatatatatatatatatatatatatatatatatacacacacacacacacatatatatat 
acacacacacatatatatatacacacacatatatatacacatatatacacatatatatacacacatatatatacacacac 
acatatatatacacacacatatatatacacacacatatatatatatacatatacacacacacacacacacacacacacac 
acacacacacatatattttttttttcttttagatggagtgtctctctgttgcccaggctggggtgcagtgacatgatctc 
tgctcactacaacctctgcctcctgggttcaagtgattctcctgccttagcctcccaagtagctgggattacaggcgtgt 
gccaccacgctgggctaacttttgtatttttagtagagatgggggtttcaccatgttggtcaggctggtcttgaacttct 
gacctcgtgatttgcctgcctcagcctcccaaagtgctgagattacaggcgtgagccgccatgcctgcccttatttttat 
atatcttccataacctttctatctctttctttctctttctctgtttctgtgtgtgtctgtgtctgataataaaatacatt 
aaattttattttataacataaatgttttataaggtcggtagatttgtttatgattttaatggctgcatagtattctatca 
tatggatatattctaaatgtaataattcctcaatggtatatttagtttactttatatttgctgctttagtttgtcacaaa 
cactgtggatagccttgtatatatttattttgttgtatagttcttattggtacagtatgatttttcacaagtatacaatt 
actttctgagaatttagctttttgattcacgtgaaagccttacttgtattgttaacaattttgttacgtttgtgaggcct 
aggcagaaggattgrcttgagaccaggaactcaagaccaccttgaccaccttgggcaacatagggggagcccatctctaca
aaaataaaaaataagaaataaaatagccatgcatggtggtgtgcccctgtagtcccagttactcgggaggctaaggtgga 
aggatcgtc tgagcc tgggag 1 1 tgagac tgcagtgag t ta tgaccacaccac tgcac tccagcc tgga tgacagagaaa 
gatctcgtcaaaaaaaaaaaaaaaattaaatatatacaaaataagggagaatactatgacaggcttccactta-cctatct 
ttaggcttctttataaaatgttcatttaatctgatctctataagaaattctgtatccttaaacaacatgtgcccttctct 
ttacacagagcgattacccccttgctatgttttattaacagctagcaagtgctgggtcctgtgtcagttgttttgttttt 
tctttttctttgtttttcttttcttttcttttcttttttttttttgttttttttttttttggagatagagtcttgctctg 
ttggtcaggctggagtgcagtggcatgatcttagctcactgcaacctctgcctcctgggttcciagtgattctccggcctc 
agcctcctgagtaactgggattacagacatataccaccacacctggctaattttttttgtgttttgacagagatggggtt 
tcaccatgttggtcaggctggtctcaaactcctgacctaaaatggtccacctgccttggcctcccaaagtgctgggatta 
caggcgtgagcccccgcgcctggccctgtgtcagcctttttaacagcataatttttcctttattagtcataactacttta 
agggctattttgatttacagacaaagtgagacttggatgaattaagaaagaattaatcaggtggttgaacctgagttcca
actagggtgacaactcttcgcagattgcctgccgctgtectggttttcacactgaaagttccgtatectaataagcctct
catectcagcaacggaggacagttgcccactcaaagccttgtgacattctagggtatectatttctacatggtgggaatc
tagaagacggagaaagaatttcteattcccgaaggagcatggattatccttggttgtagtgtttatgttecacatcacgt
ggattcccgaaggagcatggattatccttggttgtagtggttatgtccacatcaeatgggatattttattttteaggcct 
cttcatgtgettgtgtgcaacgcagcaacttttgctctaccctggagtctcaccaaagatggcctggagaceacctttca 
agtgaatcatetggggcacttetacettgtceagctcctecaggatgttttgtgcegctcagetcctgcccgtgtcattg 
tggtctcctcagagtcccatcggtgggtttgaattgcatatttgttcacttatcccctttctcataccagctaatattee 
cecaaggctctcattctgaaaataatttteattagtcctgcttgagacatgtgggtggactcagettggctcacttaatt
tttccaggtcttttttgttegcctgcgattgtgggggactgtttagaaggactttctagagcaaggaagattgcetttac 
gactatacttcaagctcctcattgattttcgcttacagatggaataataacttcatgciaaaactcaatggcatgaaccta 
ttattggatttgtaattcaacaacttcaacatcttaccaagaagaatgtgcagttattctagcaggagaaacaatgcaat 
tagagcctgcgagatgaaatcaaattg 1 1 1 tataatgagaaattagggaattcgaggcagacattagctgtgtaattgtg 
gaaagggaagaactgtagttagagcatattagaaatctggccgtgcctcttttggttaaaatttcaattaaaacatcag 



Figure 18 



BAC#1 contig#779 



aacacatttact..aacttgataocatgaa"tSc\"5St^^^ 

taaagccaaggtagaaacaatgaccctggacctcgctctgcLcgtagcgtgSatttt^^ 
^aatgtgtgagtgttccagtggagggttatagatcataattt^^^^^^ 



a 



ttttcttgactcacagtcaccttcattatgagatgtgtcatcaatctaataacagcttcLacaLccSaaa™^ 
actattaaagcactagtaaaagtggctaataaaagcttggcaatagtaagatgcLcotgattataS^^^ 



agctacactaattacaatactttgtttccaagtgtttatttttactcatatttagggcaggca^tcct^ttSSS 



ggtttcaccatattggtcagactggtctcaaactcctggcctcaggtgatccacccgcctcaq^^ 



;ctagtccta( 

ggcgacagattgagactccgtctcaaaaaaaaaaaaaaaa^^aa^a^""'""""'""^^'"''^'"'^''^'"''''^^''"^^ 



attgtgaaaccccatctctactaaagatacaaa^^^^gct^^g^gtlgtgg;^^ 



cctgtaatcccagcactttgtgaggccgaggtgggcggatcaccti::^?^^^^^^^ 



attgaggtctacattagagtcagtcttagtgagagttgcaa^lS^^^^^ 

agtgagataacatcccccagccagccagatctgagagagccctcctgtgtgtgLcttScag^^ 
cttccaccctttgccaQaactatca;,o;,m,r-^^^^==^, .—---..^^ ^--"-tgcagcgcccggctgttagtag 



gcttcgtcattcatttatccaccggttccttcacttcacacacatttcaggagLcctctafaSccaal^ 
gggctctgagaagagagtggtggccctggaagccacagtaaccgcacacSgctgtcaccte^^^ 
ctS^^r"^"'^'"^^^""^^'=^'^"'^^'^«^"3"'^^9agggaatgatLtgacaag^S^^ 



o^^^^^^ 5^ ^^^^'^^^^'''''''^''''^''^''•^S^^^gctgaagtgcagtggtgcaatctctgt 
gcaacctctggccttgggttcaagcgattctcatgcctcatcctccaaaa^«^^^„L.^..:L™!::!:!.?"''^ 



ctatactgtgtttcattttag 



cacagctaatttttgtat.t.^T5a|:Sa\1tt^SSS^^^^^ 
ctctccaccttggcctcccaaactgctggaatt^caggcgtgagccacLtgcLgaccc«t«^ 
aaacagactcagagaatgtgagtgacttttcgaagggtcactagcaagttggtgacgggcSgg 



aaacagactcagagaatgtgagtga-^^^^.o«„„;^.:„:.:_„:;::!:^'^^^=^^"'^=^"'^'^^tgtgatag 

igca 
ttg 

agttgggaaatcatcaagttcaagtgctttttgtaatcaccaagctcaaaactttiga 



jactttccgaagggtcactagcaagttggtgacgggcatggggtttcactccagtct 



ggaaactctcacccattttgccctaatcccagattttag«caggactttgg;atttta^^^^ 
a?ctc^rf^^'''^'^^''^'^^^^^^'^3'=^'^^^9=9agagatttLcttttgaL£^^ 



caaagi 
attta; 
tttcti 
aggagi 
caagc< 
3ggatc 



a^ctc^tl^aaacf'^r^^^^ 

aagtg 
ttgtg 
jagga 
:caga 



ttaaaattttcttgtg 
tgcattgaggagagga 



tgcacacttaaccatgattatccctactttgcaaattacaaggttgcagtgcagagagcctaagacactttctcaacttg 

agcaagtaggagatccagaatttgaacccaaatctctctgaatccaaaacgactttttgttactttgcacacttcagtca 

taggaatttcagtgtggatttctgtattggttatgtattgccgagtaagaaatgactctggccaggcacggtggctcaca 

catgtaatcccagcacctggggaggccgcagtgggtggattgcttgaggccaggagtttgagaacagcctggccaacatg 

gcaaaaccccgtctgtactaaaaatacaaaaattagccgggcatggtgatgcatacctgaagtcccagctactcaggagg 

ctgaggcatgagaattgcttgagctctggggtgggcagaggttgcagtgagccgagatcatgcctgggggacagtgagac 

tctgtctcaaacaaattactccaaaatttagtggctaaaaaacaacataaaacaagaaacatttctgttgtgcctcctct 

tttcctccctccctctttctcatagttaatttgtttgaatcaggattcaaataaggtcgatgcatttttaatctataggt 

tttgtctcctttttttttttttccttaaaaaaatttcttttaatgcaacagatttaagggaccagggtgttctctagagc 

aggggtccatgacctctaggccacaaaccaatagcggagccgtggcctgttataaattgggtggcatggcagggagtgag 

tggtgggcaagcataggaagctgcgtctgtgcttagagccactctctgtcactctcattagtgcctcaggtcctccccct 

gtcagatcagcaacagcattagtctcataggaccgtgatccctattgtgagccacacatgcaagggatctaggctgcatg 

ctccttacgaaaatctaatgcctgatgatctgtcactgcctcccatcaccttcaagatgggattgtctagttctaggaaa 

acaagctcagggctcccaccgattgtacattatggtgagttggataactatttcattatatattacaacgtaataataat 

agaaataaagtgcacaataaatgtcatgtgcttgaatcgtcccaaagccacccccactcctttgagtctatggaaaaatt 

gtcttccagaaaaccacttgctggtgccaaaaaaatttggaaacctctgctctagagatttctgcagtctgtactttggt 

gattgcatctctgtggtgtcttttaacatactcttctgtctcctgagtbtcctgtgaaagggtagataactcaagagact 

caatcaacttcgtgtttaatgttttcgtaagaatacttcagaggcagtggtgtaatcgtccaagagtacacacaggcctg 

atcgccaaagccagttttcttcaggcccagcacgtcctgacGtcctcactcctctgcccactctctgtttactcactccc 

attttcattcattgattctgttattctccactaagttttctgggtggtcagcccaggaagcctcagcctgcttccagata 

gttctcctcagagttagcggtagcagcggtcgtcggccagtgtgtctcgggtcaaattattttgggcattactggaaagt 

tccaaaccaaagcctgtggtaagttccacccaaactggcttgtgagggatatttttgtttaaactgcattcctggagagg 

agccagacttcctactgcctggcagcagtggtgtccataggagtatgaagagctgggaccctttctaatcactcagacca 

aatattggggttttctgaaatggactgagaaataacagtatgtttttatgaggctttgcacattttccttcctagcagta 

gctgcttagtctactgaaaagtgcatattttgaacagggcctagaagagttaacagctcctagagagaggtgctctgtaa 

tactttttcttcttcaaaaaatggtttatggctgggcgcaatggttcatgcctataatcccagcgcttttgggaggccaa 

ggtgagaggattgcttaagcccagaaacgtgagaccagcctgggcaacgtagtgagacgctgtctctgtattacaaaa 1 1 

ttttaaaaaacggtttgtaggtccttaagtccctgataaaatagagaactgaattgcaatcctggaacttaaaaagttgg 

tgacgacacctgagatatttattacttagattgcagttactgggtcagcttgtataatactgaccaagggtttttgattc 

ttcctggaattgataggaaattcatattaaaataattacccaagtccaaacatttttagaactgcatttttgatcatgga 

tttttatgtctcttctgaactttctgtcaccggtataatttaaagaaattatacttaagctttgtctcacttagaagata 

atatagaacagtggtgtttttttaattaaaaaaaaagttaaaataacggttttgtatccttgctttacttcttaaacata 

tgggaggaaaaaaaatctttaacaagtttatttattttcattttctgctaaattactttcagaacttgaatctactaatc 

ccagatataatattcttggattcatattccaaattttgctgtctcaaatccatctagggaagtgggtgggctataaatta 

taaataaattccaaattttgtgggatgaattaccctgaagaccaacgtgtaaattacatattaatctttctttttctccc 

tagctcggttttaagaataatgttttagccaacatatctgcattactcttggctcaatatgagaaatccatttttggttt 

gcctaacagaagatcatgttgctttgcttctctacacagtatgaaaacccaaagaaaagaaaaacagaggcag 1 1 1 1 1 tg 

ctctaatgaatgctctaaatctagctcttaattatgattttttaaggaaaattttgaaaagtctacaagttaaatttttt 

tttctatcccatacattttccatcctaaggcattgaaaaagcacactgtgaaatacttagtgtatctagaaacatcaggg 

aagaatgcttccctcctaagcaaaattttgccttctgaaactttttcagcattcagtctttttatataatacttagaaaa 

atatttctgaaatagatcatacactctcttcccaaaaacatcaaagtatgaccgtaaagggcagaggtaggtaaacttct 

tgtaaggggccagagagtgaatatttgagaattttcagactattaagatctctgttgcaactgcttgcttttgccgtggt 

agcctgaaagcagccgtagatagtatggaaatggatgatcatggcagtgttgcaataaaactttacaaaacagacaatgg 

gccagattggccaggggccatagtatgctacccctgggcaacaacctgtatgccctggagtagtgtaaagaacgtgggtg 

ttgggggtcaactgacgcttccagctctaccacttactggctgtgtggctttgggcaaactactgaaaatctctcagcgt 

cactttccaagtgtgtgtaatgtgtattttcacagtgctttgcaggttgttgattattgaaaatagccataatgcatgaa 

attaccagacacatctcactttatggagcctggggctattggtaatatgcatttctttctcatcttgatcgtaaaatgat 

cttagaaaggtttctgagaatatatagagtttaagacagcaataagacaactaattaattaaacaggaaaaggggatgtt 

gtgctcagagaggaagtgtgggtctcataagggctttcacaatcgtttgagaggacacgtgtgatgtctcatgcctgtta 

tcccagcactttgggaggccaaggcaggcaggttgcttgagttccggagtttgagaccagcctcggtaatttggcaaaac 

cttgtctctacaaaaattacagaaattagttgggtgtggtggtgcacacctgtagtcccagctgcttgggaggctgacga 

gtaaggatcacttgagccagcatggtggaggctgccatgatcatgctactgcactccagcctgggcaacagagccagaac 

ctgtcttgtaaagaaaaggaaaaagagagagaagggcagaaagaaagaagggaaggaagaaaggaaaattgggcccagga 

atgatctttacaatgcctgacaaccaagagaagaagggaaatgagcttcacattgcctgcaagctctaaggtgacaagag 

ccaagagaaattattgttactgtagtgatgttccactgaggatcataaagtac 1 1 tattactctactgagtatggtta 1 1 

ggatatgtgttcttctttttctttttctttatcttttttgctattcttttgttattcttgatttatgctgatggaaagcc 

atggacccaaggatgcttcacagt 1 1 tc 1 1 taggagtaaatgcttagattccatgttctttgacatgagctatgtctg 1 1 

cctctcgagtggaagcatccttttcagatgagttgccagaaaagcagccagctctggataagtgaggtacagcagaacac 

actgcaaatactaggaatccttaagtacagtggaaccccaaagcactctacctgctttctttctcacctccttaaaaact 

ttttttgccctcacctcatcatttattcagcagtcacaacagtgccaagaacttggctagagattggaaataaagcttat 

gccttctctcatatctcctggaccttatttctttcttacaagaattgtgatgcttaaccagtttttttgataaccttttt 

ataaatgccaacccttccaaaaaacctgcccccctggtggagagaagaattattacatcaattaggggtcacttagcatg 

acatttgtcggaaaaaaaaaagttagtgagcctttttgccatattaaaagtcatcactgccaagacataaatgaaaatgt 

gttcgaattaaccacaccaatgttcacaaaataaacatttttgatttcccaacagaatcctaggtttaactatcactatc 

atctttcatgaaatcaaagtcatatatgtaaattgaacacaactttcccttccatagagagtaaaaaccacgctttggag 

ggtagatacaattaccccagggttgtcttttcccactcctcacaatcccaccagtgcacatgcaaggtgatgtccttcct 

ttagctatagcaaataatgttaattattgttggtgttaaataatgattatgtaaagcactagactagacattcgtcggca 

aagtttttctgcaaagaatcagatagtaaatatttttgctcttatcagccagacagtctctgcggcaaccattcaagcat 

tgttgaatacattgttgagtgttacatgagactattgtaatataaaagtagcctgggacaacacataaataactgggtgt 

ggctgtgtcctaataaaactttatttacaaagaacaggaagtggcttggatttggtatctggcctggcagctgtggttta 



ca.tgtt.aatgg.tca.ccC.gC.caat.at.aaSS^^^^^^^^ 

aaggctagaacttcgtccaaaatctcttcttctgCtgagctcagcttgcatagSSag 
:tcatttgggaaatataccatgtaaaaaacattgtttctaaaggagatttgtcccataaata^^ 
:actgcctagtgggacaattagaaaggtcagttcaaggttggfLagatgcScttt«^^^ 

jccaattttctatcaccaaagggaaatcgttttgctggaatatgtggtaaaggaggttSa^^^ 
:cgctcagttaaggtactcagactattttccaaccaatcaaaaLggtgct?cttSt^^^ 
rcagctgttgactgtcatttgcatcatctttaaacatttactgtgaatJtcLtgtS^^ 

tcccac.g.cca.ca.— 



aaagatgg 
atacttaa 

ttggaa 



cacagttgatgagttatt 

tctgtgctacaacttatta^'^^.ISSS^^c^^^^^^ 



gtgccgattctcagtga^agaatctt^^^ttgal^^ct:^^^^^^ 



^^^^^'^'^^""^^"^^^"^""'^^•^^^^^^^^^ta'^^tttttaatattgtg^gt^^ 



cataggaaaaatgttaa 
ttttctacattttccat 



aatttagtgaagatagtaaat.aacaaagtggaaaagLtgSSt"^^ 



tataccttatttgtaaacatcatgggtggggggttgcagtaaacatgttggaaagtagggttggaggtccgtagaaattg 



atgaacattattgtgattat 



ggggcttcagcacttcccccaagctcaacaccaaccccctttctgagcccctcttgaaggagagttccctgggacgtgcc 

tggtattggtacaatcagtcaggaagcatttttcctggggagaaacttacaagtccacgatcaaagccaacaagagacaa 

ggtgttacatgactcattttcggtttaagaagtgacaggctgattctaagttgggttcaattattttgttaaagcgtttt 

gcttatttgacttctcctgacctcggaaataattctaaccaatcagtgctggctcccattggccctggggtctggttgct 

ttacagctggtgacagggggaccactccactaccacatgtgaattaatcctcaactccagagccaagtgccattctccag 

caaggttgtatttcttcattagctattcccagggcccagaaagtcccagaggatgtcagagtacattaatttttatcata 

acatggaatctttcaggtctgaatggcagcacacggctgtcaggggcttctgaactctattacagctccatatatctcta 

ggcaaaacagaggaaagagtcgtcattggcaagggagatgtacaaaatgcatgagatgttttattttttgagtgacttga 

ccacgtgcttaagcacattccccaaacaatttttttcttattgtttgtaagttgtaagttgtaaattcacctctgccacc 

acctattaaagcccactccctgcattaaaactgtataaagtgtatttaaataaactctctttgcatgatgtgaatgaaat 

cgtcatctggtacttaaaactattctataaagttattaaaaaattaatgttcccttcccatgatttttctgcagaattta 

tgcatccatgatactgcagaagttcataaataatggcttgtattgctgctttagtattgctttatgcctacgaaatataa 

tgttaatttgtagcaatgctaatgtgttttcaggaaggctctttgtttattgcctttattttccccacttaccaagtggg 

taaaatgctttgagggttgcattttatgtattcaggaggcccaggtattattttaatagaagcactattgacaaatacca 

gtcatcccccctgtgccaggccctggatgaggcactgcttcgcatgggggctccccagattgtcccacaaggaaagcata 

gtcaaagacaaagttttcagttgtaagagtaaatgtgttctgcctaggcattgtcaagtaatttactgccagctctagcc 

cttcactcaagtttcctggatacttttgacttcttagccatggatgtgtttgaaggctgcatggaccttcacttacttgc 

actgcaggtcagcctaattgcatgagctctgtggaccacagagcagggttttccaaagttcaccaagacaaatattgtat 

tatcttaacatatattcattttttaaaactgaaaatcagaagagcaactccacctagcagaagtcttttgcaaagggcga 

ggcgaggctaaaaagtatagaagagttcgtttccagtgcaattttataaacacagatggtccttaaattaagcaaatggt 

acctaaatgactgtgttgtggataatggtaacagagggagggactcgggggttttttaaaaagtactgattgtatgcagt 

gttttaaacagataactgtgatcttagtgtgatgaaagatgctgggagatttcaccagtggtatcttattatttttcggg 

gattttgtaattcaacaaaattctgttgtatgccaagcataaccctaggtgtgagagcacaaggtgacttcagataccac 

ctctatccttcaggggtttggggcccattattatgacttaatccattttgggcgtgagaagctgagggtcacagaaagaa 

ccaattccctctttaaataatgccaccccaaccctcctcatctgccaggtctttcccttcttctatttgtatgaataata 

gtcactttctcttgtggagttcgctaaattctactttggcctatcaaatttctttcatatcacaactaaatttcttaagg 

acgggactatggttcatttgtcagacgaacaaatgggaatttgccaagagacacttgggttaatttacgtcttttccatc 

caagggcactatgttgaagtgaggctagtaggtcatgagtgtggttgaagttactttttcttactttcccgaccagcccc 

catccttactgcacttaaagttgattgtccattttattaaatgtccccaggaagccagaacacagggcagtaaagtgctg 

aatgcaaagggcaggagaaaaatggaaacaaccagaactgtaacaccaaggaatgagacctgcatgtcagatatcatgcc 

cattgcactaagtgccattggggcacaattatcaaatggatgcattttccctagaaaaccatcttggagagcatgtggat 

gtacttctattttacatttccccctatttacaatcaatgagattgagattttgttgctgggactgctgatgatgggatgg 

gaaaatataatcaaggtaatggacatgaggcaaaaatttaaggaaatgacaaaaacaagagtatttccattttcagttaa 

gtgtatgtactgatgttctggaattcactataagaagttgcaaatggtgcatgaaatgaaaaattcctggtggtctccag 

gggacacagccggtgctgtgctccactctgggtaactgttttggattattttctctattccaactgaaataaaaaaaaat 

taattaaatgtggctaggttatcttgacagcagaatccattcccagttaattattattttaatacttgatggtgtctgtc 

aaattgtcgacatgtgacggtcctttcaaatttaaaggaatagctgatggtcactggccacccaagctgatactgatttt 

atatgttgatgtttctcattttatttgctctttccttgaatatttattcaggacattctctaccagacatattgagtaag 

ggcaacagaaacaatacataagtatcttataaatgtggaaaacaatgtatatgtgttttttatctctcaatgattggtgg 

gtaccatatccccaaagtagaatgagcatttgagaaaacaggaaatatcctcttttaggcaccatctctgtcaaggctga 

tgctgggctttttatatattttctctaattcttgtggctgtcaaacaaggtgggcattatcattccctttataggggaca 

cagctgtggctcagaggggtttattcactttcctgagggccacacacataatgagaggcagacacaggtgacgaagtgag 

ttttccctgtcacgccatcttatctgtcacatacctctctgacatgctaaaattgcactaaacaaaagaattctcttatg 

cacatatcatgcaaaagatattctttaactggggatcatgtttctcattccatcaatagaatgactaacattttctgagg 

gtgtctcacgtgaaagtaaatcgctcatgtttgttctttttaaaagatgcccttcgtattgtgtatcttgcagtcttgct 

ttctcaaacttaagccaactatatcgtcatttttgcaaaatcactgcgtcagtttactattatttaatgtttattgctac 

caattttaagaaatcctttataggactatttgtgaaattgattttgtgaggatgatgatataatttccattacattacag 

catataaatataaatatatatatatatatatatatatatatatatatatatatatatatatatattttattatttttttt 

tgagacggagtcttgctctgtcaccaggctggagtgcagtggtgcaatctcggctcactgtaacctccgccgcccgggtt 

caagcgattcccctgccttagcctcctgagtagctgggactacaggcatgtgctaccacacccagctaattttttgtatt 

ttagtagagatggtttcaccatgttggcgaggatggtctcagtctcctgacctcgtgatccgcctgccttggccttccaa 

agtgctgggatatacattttttttttttttttgagagatggagtgtcactctgttgccgaggctggagtgtagtggcgca 

atctcggctcactgcaacctcctcctccggggttcaagcgattctcctgcctcagcctccccagtagctgggattacagt 

cgtgtgccaccacgcctggttagtttttgtatttttagtagagatgggtttcactgtgttggctaaggtggtctcaaact 

catgacctcaagtgatccgcccgcttcagcctcccaaagtgctgggattacaggcgtcagccactgtgcccggccggata 

gaaataatttttataaactccttggatgctacctaaaatcatcttgttttgctagtggcacatgctgcattttgggcagc 

tgtggccttggtggattgctgaagtagatttgaccttacctggactgaggcagctgttgaagggaattgctgtgttcagt 

gtatactgccatccat^atttcatgaaaccagctctagctatttaagcaggggtcaaacttagaattctacattattttt 

ttcccttttctgggaggaaagacagttgaacaccagcaaagactaagaaatttcttagaagactgtgggtccttgggccc 

tttctattgaatttcagagtatttccaaatactatgaagtcttgcagcttagttgagaaatgccccagatggtgtgacat 

tctgcttccaggagggattggaaagtatttccttttacataacattccactcagctcattcctttgctgtgtctgaaatt 

gaatcccccaaagccacaattatcttaacattcagaagagtgtttatttaatctgcaaaatcttgcctcacttttgggga 

gcatgttaacaatttcacttacaaatcttctgtgtaactcaaccccatggtggtgtctactgctgctcctagactcttta 

aagcacctttctcatctcaggtttgaaatgatatgtctcattcttgggttccttgagtcgtaatgggtttgtcttgtctc 

cacagcataaatgactctttcttgatcaactagaaccacatcaacttcttccctccagcttcagtgatatattgtgaaac 

atggctattcaacgtcctgtagaccaaatgccataagaaaaatagcattgattcaaacgtatccatccagatacctaaaa 

aagttttacttcttaccacatcttgagtctgggcaaacacgcacttcctatggacattgattactgtctactgtagagat 

aacatttgcacatacagattatggcacatggtagaaagtgttaagtaatgtaggaatggacatatcccaagcaaaattgg 

aagccaagtcccctgtccctgctcaagttggtatgactggtgtatggtgccttaatgggtacttaaagtccaggtgagag 

tggcaggaggcagccaaatgcctaggtagataggagccggtccctgttgaaaccccacttccaagttgaagacagtttaa 



agactgaaagccaagctacaagttaaatcctcggaccagattgagaactcgtcttcttacttggtgcactcttctgatta 
atccccacctttcacctattttacatactcctgcccttccctaactggtttcccatgctgtcatgcccacctttgagtot 
tgccttcactctaaccttctgtgcatgctcacaaagtaattagcatgtaccctccattctgagttaatataaggccccaa 
acccagccacatggggcaactttactgccttcaggtaggggaaccacccccaccacattccctctccactgagagttttc 
cttttagttaataaattcggctccactcactctccattgtctgcatgcctaattcttcctggttgtgagacaagcagtta 
gacctagctgagctaaggagcagaaagactgtatcacagggaacttgtgtaacagcttgatctcctgtcctacgtagcta 
tctattggtaagaagttgaaggaacttgtgtcattccgttgtgcctgtcgtcttgaccttgtaaaaggtcttgggtaagc 
atgcaagaagttttgaagagggagatacagctaatttgcagataaagagcaagggaagaattcctggagaaaggaagagt 
ttcctgagtcacctttgggaggtaggaagggtttgacatgtaagctgggcatctgggaacgagtgagggattctgtgagc 
cccatctcagtggaccactcaaggaaggtgggtaagccctgggtaataagtgtgtaagcagggaacagaaagtactgtga 
tttaaaatatgttaatttttctaccgtacagatgagagccagcttggagatgggctgtagctcaagcatcttacctacct 
ctgatttcttaatgccacgttataaggctgctgcttatagctcttgaagtcactccaaaaacagatgagtgagaccctgt 
tgctaaagtcccaccgggtgtagattattcacagatgtatacacagtggctcactccaggtaggatgtgatcagtgcttt 
tagaaatacagaaagtcctattggtttaaaaaaaatttttttttgtaatgaattgagttttaaagctagcactgtacaat 
aaaagggtgaatttcactatgaattatgacaaacacagtcatagagctgccatcatcactgtcaagatacagaacagtgc 
catcaccccccaaatgtcccctgtgccttactgtagtcatacctgctcctgacacctagcccctgggaaccattaatctt 

f!^'^^^f^'*=''=r^^'^^"^^""==^"9=tgttgtaataaattaccacaaacttatgtgcttaaaacccaLgc 
tattatctgacagctctgtaggttaaagatttgtcatggttctcattgggctaaaattaaggtgtcggtatggctctgct 

r! ^^^^^''''"''^''''^''^''^^'''^^''^''''''''^•^'^'^'^^^ccacaatcgggaaaggatctcaggactgttgt 
gatgacactgtgcttacctagattatctagcatgagctccctgtctcaaggtgatagagtttgcatgtttttctcctgca 
catgccatgttgaaacataatccctagtgttggcggtgggtgtgctgggaggtatttggatcacgggggtggaaccctca 
tgcatgacttacggccatccctttggtgataagtgagttcacatgatatctggcaccttccttcctctgttg^ 

catgcttcttgtagagtctgcagaatcggagccaattaaatctcttttcattataaattatccagcctcagttacttctt 
^^^=r^f^T^*^''^^*'''''^^^''^''^^^^''''''''^'''''''''^9*'^'^=^99'^<=tgcagcatgtcttttaccatgtaaggt 
aatatgttcagtggctgtgggggttaggatgtggacttctttgggggacttttatttttcccagttactatttttgtgac 
tcaggaatttagggacagtttggctggttgtttctggctcagggtctttcttgggctgcaatcaagatgtcagctggggg 
ctgggcatggtggctcactcctgttatcacagcactttgggaggtcgaggtgagtggaccatttgaggttaggagtttga 

cagttagttggggggctaaggctggagaacagcttcaaaacaggaggcggaggttgtggtcaactgagatcacaccactg 
cactccagcctgggtgaaagagcaaggccccacctcaaaaaaaaaaaaaaaaaaaaaaagttagctggggccaaaatcat 

agctataattcctgactattgaccagaggcgaacctcagttccctgccatgtgggcctcctcatggggcagttgata^ca 
cagcagttagcttccattggattgagtaagcaagagagcaagaacaggagtgacacagaagccagcatctttttgtaaac 

ttfo^^^f ^^^^^^^^'^''^^^^^^^^^^^^''^^^^agcattgggagtcatttggaccctgcctaggacagtgcg 

^^^^^^^ .^^^^^^^^"^^''"^^^''^''^"^^"^^^^ggaaattttcttcagtgaaataatitLgagca 
aaaccttgaagaacaattaggaatttgacagagggaatggcacgaataaagacccagacttaatcaagtgaggggcgtac 

attaoaa^aotl^^T^r 

attagaatagtatttcaataagtttaactaggctttaaccagtattcataaaaagtcaagtgggagaatgtagaggtgac 
a^^^^^^ ^^-^''''^^^'''^''^^^''^^^^^"'^'^'^^ttcctatgatcgatt^gggga^ 

tgtatcaccgagttggtcctccttgtcaacagagcatgtgtgtgggggtgatgacttcccagatgtgagggtgccctcca 
tttaggggttcctaatgagcctagagatactgacattgaccagtcatggagactaaaggagaagatgggatagcacattg 



